Compaq Computer Corp. stood tall this week in providing the lion's share of the technology for what may amount to one of the most significant IT projects of 2000: mapping the human genome.
Harnessing the power of more than 200 AlphaServers and a 70- to 80-terabyte database, scientists were able to map the 3.12 billion base pairs of DNA that make up the genetic blueprint of the human body. Scientists hope to use the map to study how chromosomes relate to diseases and the way that metabolism occurs, among other things.
Constructing the sequence was no easy task from an IT standpoint. Celera Genomics Corp. began designing and building the massive data center and network in the fall of 1998. The complex infrastructure combines specialised database software and hardware: 300 ABI PRISM 3700 automated gene sequencing machines from PE Biosystems, which feed data to Compaq AlphaServers running Tru64 Unix; Intel Corp.-based workstations linked to servers running Windows NT; and a Gigabit Ethernet network with more than 200 miles of fibre and 200 miles of 10Base-T copper cabling to tie it together.
Celera also used Platform Computing's Load Sharing Facility (LSF) software to configure its network of AlphaServer systems so they could process as much data as possible.
The infrastructure was so massive in part because scientists opted to use the genomes of many people to create a more accurate map.
Many of the workhorse servers doing the job were configured into eight-way clusters. Celera Genomics and Compaq determined the most cost-effective design would be a system-area network - essentially a campus-type network - of clusters of eight servers running across a Gigabit Ethernet network. While using a single system might have eliminated some setup and configuration issues, distributed clusters were more cost-effective, said Ty Rabe, director of high performance computing solutions at Compaq.
Marshall Peterson, vice president of infrastructure technology at Celera, says his company chose Compaq after giving IBM Corp., Compaq, and SGI software code to compile and run on their respective high-end servers. The software code was designed to assemble pieces of the tuberculosis genome. Peterson says it took IBM 87 hours to run what it took Compaq seven hours to run, a more than twelvefold difference.
Celera also partnered with Compaq because of Compaq's willingness to lend its engineering expertise to the project. "They were willing to help us optimise our code for the architecture," Peterson says.