Case Study: Sandia tests out InfiniBand clustering
- 12 December, 2002 07:11
Sandia National Laboratories' Curtis Janssen needed an inexpensive and fast technology to cluster the Intel servers in his laboratory, which were used to run high-performance computing applications. When he heard about InfiniBand, the new high-speed bus technology, he decided to evaluate it as a replacement for slower, more expensive and proprietary clustering technologies the lab had been using to create supercomputer clusters.
InfiniBand is a switched-fabric I/O technology that can be used to interconnect servers, storage and network devices more efficiently than traditional bus technologies, such as PCI or PCI-X. That's because InfiniBand switches and routers provide features such as direct memory-to-memory connections for clustered servers at speeds as high as 30G bit/sec. Current clustering technologies, such as a popular and proprietary interconnect from Myricom called Myrinet, operate at only 2G bit/sec.
Janssen, a team leader, is responsible for evaluating emerging technologies at Sandia, a laboratory co-headquartered in Livermore, Calif., and Albuquerque, N.M., and owned by Lockheed Martin. The lab performs research and development for the U.S. Department of Energy National Nuclear Security Administration and has 10,000 employees.
At present, some of the clusters in Janssen's organization are interconnected using Myrinet, which eliminates the latency and overhead of the PCI bus. It has host bus adapters that offload processing from the host CPU and communicate directly with the network to decrease latency.
"Right now we have a pretty big cluster in Livermore and another in Albuquerque that are based on Myrinet," says Janssen. "That's a technology for the very high end. If you look at our last big supercomputer acquisition, ASCI Red, it uses the Myrinet custom interconnect." ASCI Red consists of almost 9,300 Intel Xeon processors.
He was looking for a 10G bit/sec standards-based and less expensive interface for his Intel-based servers, which run I/O-intensive material science and quantum chemistry applications to compute the properties of atoms and molecules.
Myrinet wasn't suitable because it was both slower and more expensive. A cluster made up of a Myrinet switch and eight adapters starts at $16,300, compared with $15,000 for a similar InfiniBand network that runs at five times the speed.
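The price and speed comparison can be worked through as simple arithmetic. The sketch below is illustrative only: it uses the prices quoted above, assumes the InfiniBand network in question runs at the 10G bit/sec Janssen was seeking (five times Myrinet's 2G bit/sec), and ignores latency and protocol overhead. The function names and the 1G-byte transfer size are hypothetical, chosen for the example.

```python
# Figures quoted in the article: entry price for a switch plus eight adapters,
# and nominal link speed in Gbit/sec. The 10 Gbit/sec InfiniBand figure is an
# assumption based on the "five times the speed" comparison.
INTERCONNECTS = {
    "Myrinet": {"price_usd": 16_300, "gbit_per_sec": 2},
    "InfiniBand": {"price_usd": 15_000, "gbit_per_sec": 10},
}


def cost_per_gbit(price_usd: float, gbit_per_sec: float) -> float:
    """Return dollars paid per Gbit/sec of nominal link speed."""
    return price_usd / gbit_per_sec


def transfer_seconds(size_gbytes: float, gbit_per_sec: float) -> float:
    """Ideal time to move size_gbytes at the nominal link rate.

    Ignores latency and protocol overhead, so real transfers take longer.
    """
    return size_gbytes * 8 / gbit_per_sec


def summarize() -> dict:
    """Map each interconnect name to its cost per Gbit/sec."""
    return {name: cost_per_gbit(**spec) for name, spec in INTERCONNECTS.items()}


if __name__ == "__main__":
    for name, spec in INTERCONNECTS.items():
        dollars = cost_per_gbit(**spec)
        seconds = transfer_seconds(1, spec["gbit_per_sec"])  # hypothetical 1 GB transfer
        print(f"{name}: ${dollars:,.0f} per Gbit/sec, {seconds:.1f} s per GB (ideal)")
```

On these assumptions, the gap in price per unit of bandwidth is wider than the gap in sticker price, since the cheaper network is also the faster one.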
Janssen installed a test bed consisting of a Paceline InfiniBand switch and attached it to eight servers, each containing PCI-based Host Channel Adapters. While Janssen initially is clustering only eight Dell PowerEdge 2650 servers running Linux with InfiniBand, "we hope sometime to scale up to a larger number so we can get a look at how the applications grow and run on larger clusters," he says.
His first step is to get the Message Passing Interface (MPI), a communications technology used in high-performance computing, up and running. MPI is middleware commonly used with high-performance applications to speed communication between distributed servers.
Janssen says that once MPI is running efficiently, Sandia will look at connecting storage devices to the cluster and see what performance benefits result. "We don't have any immediate test plan for that, but it will be important for our applications as we get the InfiniBand clustering portions stable," Janssen says.
Janssen doesn't expect to replace the Ethernet connections to desktop computers with InfiniBand, but he does hope that at some point the high-speed bus will replace the Fibre Channel connections in the organization's storage area networks.
As for whether Janssen approves of using the servers with existing PCI buses instead of InfiniBand-native servers, he says, "I'd like to see the PCI bus replaced, as it adds a lot of overhead processing, so a native implementation would give the highest performance. But we don't see any [server vendors doing that] right now. So we are going to go with PCI-X or the upcoming PCI Express and use an InfiniBand switch."