The best-known grid computing implementation—as well as the world's largest distributed computer—is that of the Search for Extraterrestrial Intelligence project (SETI@home).
A brainchild of the University of California, Berkeley, SETI@home uses idle, Internet-connected PCs all over the globe to supplement the work done on a supercomputer at the Berkeley campus [QuickLink a2800]. SETI@home has raised awareness of grid computing, but it has also relegated it to the realm of science fiction in many people's minds.
On the contrary, thanks to advances in grid computing's underlying technology, businesses can use their networks to undertake complex computing tasks such as designing machinery and performing what-if scenarios based on vast financial databases. Someday, it may be possible for grid computing service providers to create virtual supercomputers and rent processing time to businesses anywhere in the world.
Grid computing works by distributing computational resources while maintaining central control of the process. A central server acts as team leader and traffic monitor.
This controlling server divides a task into subtasks, then assigns the work to computers on the grid with surplus processing power. It also monitors the processing; if a subtask fails, it restarts or reassigns it. When all the subtasks have been completed, the controlling server aggregates the results and advances to the next task until the whole job is done.
In a campus grid, a hierarchical structure of many grid servers may handle subtasks, but all processing occurs on a single network.
In a global grid, machines can be on many different networks and on the Web. Because they process under so many different conditions, network latency can be a problem. And before any processing can occur, available resources must be identified and located, access to them must be negotiated, and the hardware and software must be configured to make effective use of those resources, which are often many smaller computers.
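The identify-and-match step can be sketched as a simple broker that filters advertised resources against a subtask's requirements. The resource records and field names here are hypothetical; in practice this negotiation is handled by grid middleware rather than hand-rolled code.

```python
# Hypothetical resource advertisements a broker might collect from grid nodes.
resources = [
    {"host": "nodeA", "free_cpus": 2, "free_mem_mb": 512,  "latency_ms": 12},
    {"host": "nodeB", "free_cpus": 8, "free_mem_mb": 4096, "latency_ms": 180},
    {"host": "nodeC", "free_cpus": 4, "free_mem_mb": 2048, "latency_ms": 35},
]

def discover(resources, need_cpus, need_mem_mb, max_latency_ms):
    """Return resources meeting the requirements, lowest-latency first."""
    matches = [r for r in resources
               if r["free_cpus"] >= need_cpus
               and r["free_mem_mb"] >= need_mem_mb
               and r["latency_ms"] <= max_latency_ms]
    return sorted(matches, key=lambda r: r["latency_ms"])

best = discover(resources, need_cpus=4, need_mem_mb=1024, max_latency_ms=100)
print([r["host"] for r in best])    # → ['nodeC']
```

Here nodeA is rejected for too few CPUs and nodeB for excessive latency, leaving nodeC — illustrating why, on a global grid, latency is a selection criterion alongside raw capacity.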
As the trend toward larger storage capacity and faster processing power continues unabated, scientists are planning to experiment with petabyte data archives in a few years. And industry giants such as IBM, Microsoft Corp., Oracle Corp. and Sun Microsystems Inc. are racing to develop grid computing strategies.
"There's been a tremendous interest in grid computing from the commercial side," says Ian Foster, a director of the Globus Project at Argonne National Laboratory and a computer science professor at the University of Chicago. "Initially, the interest is in using technology as a means of making more efficient use of computing resources."
Foster says advances in three areas are driving interest and development in grid computing: ubiquitous connectivity via the Internet; the dramatic increase in network performance; and the development of collaboration tools, along with the acceptance of collaboration as a viable way to work.
But pinning down what is and isn't a grid can be a knotty question. An argument can be made that all network computing is a form of grid computing. Should the definition derive from size, purpose, architecture or some other criterion? Foster has proposed a grid checklist. For an aggregation of computers to be a grid, he says, it must do the following:
• Coordinate resources that aren't subject to centralized control.
• Use standard, open, general-purpose protocols and interfaces.
• Deliver nontrivial qualities of service.
"The creation of large-scale infrastructure requires the definition and acceptance of standard protocols and services, just as the Internet protocol TCP/IP is at the heart of the Internet," Foster says.
The Globus Project and the Global Grid Forum are working on defining those standards and protocols. The Globus Project also researches and develops ways to apply grid computing concepts to scientific and engineering computing.
The open-source Globus Toolkit, developed by Argonne National Laboratory and the University of Southern California, provides security protocols; services such as resource discovery, resource management and data access; and software libraries that support grids and grid applications.
Several groups from science and industry are working on standards, Foster says. They share many of the same concerns. "Every site is concerned about security, about protecting their data. Everyone is concerned about network management costs and interoperability," Foster says.