Mapping human genome took computing grunt

Compaq stood tall last week in providing the lion's share of the technology for what may amount to one of the most significant IT projects of 2000: mapping the human genome.

Harnessing the power of more than 200 AlphaServers and a 70 to 80Tbyte database, scientists were able to map the 3.12 billion base pairs of DNA that make up the genetic blueprint of the human body. Scientists hope to use the map to study how chromosomes relate to diseases and how metabolism takes place, among other things.

Constructing the sequence was no easy task from an IT standpoint. Celera Genomics started designing and building the massive data centre and network around October 1998. The complex infrastructure combines specialised database software and hardware: 300 ABI PRISM 3700 automated gene sequencing machines from PE Biosystems feeding data to Compaq AlphaServers running Tru64 Unix; Intel-based workstations linked to servers running Windows NT; and a Gigabit Ethernet network with more than 200 miles of fibre and 200 miles of 10Base-T copper cabling to tie it together.

Celera also used Platform Computing's Load Sharing Facility (LSF) software to configure its network of AlphaServer systems so they could process as much data as possible.

One reason the project's infrastructure was so massive was that scientists opted to use the genomes of many people to create a more accurate map.

Many of the workhorse servers doing the job were configured into eight-way clusters. Celera Genomics and Compaq determined the most cost-effective network would be a system-area network - essentially a campus-type network - of clusters of eight servers running across a Gigabit Ethernet network. While using a single system might have eliminated some set-up and configuration issues, distributed clusters were more cost-effective, said Ty Rabe, director of high-performance computing solutions at Compaq.

Marshall Peterson, vice president of infrastructure technology at Celera, said his company chose Compaq after giving IBM, Compaq, and SGI software code to compile and run on their respective high-end servers. The software code was designed to assemble pieces of the tuberculosis genome. Peterson said it took IBM 87 hours to run what it took Compaq seven hours to run.
