IBM Corp. is teaming up with a North Carolina organization to build a large computing "grid" across the state that will give life sciences researchers more computing power.
The North Carolina Bioinformatics Grid (BioGrid) will let thousands of researchers and teachers at universities, pharmaceutical companies and research institutes share information and number-crunching power. The goal: making sense of the huge amount of data being generated by research into human, animal and plant genomes.
"This will involve a teraflop supercomputer and what we envision to be a petabyte (1 million gigabytes) of storage," said Caroline Kovac, general manager of IBM Life Sciences. The data in just one of the major public databases on gene research, the U.S. National Institutes of Health's GenBank, doubles every six months, said Thom Dunning, vice president of high-performance computing and communications at MCNC, a state IT development group formerly called the Microelectronics Center of North Carolina.
IBM is cooperating with MCNC and the North Carolina Genomics and Bioinformatics Consortium (NCGBC) to build the grid. Members of NCGBC include the University of North Carolina, Duke University and the National Institute of Environmental Health Sciences, as well as private companies including GlaxoSmithKline Inc., BioGen Inc. and SAS Institute Inc. The partners will work for 12 to 18 months to build a test bed and hope to complete the full project within three years.
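To put the project timeline next to the six-month doubling period Dunning cited for GenBank, the arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope illustration only: it assumes the doubling rate holds steady for the whole period, which neither IBM nor MCNC claimed, and the name growth_factor is made up for this example.

```python
def growth_factor(months: float, doubling_months: float = 6.0) -> float:
    """Return how many times larger a data set becomes after `months`,
    assuming (hypothetically) that it doubles every `doubling_months`."""
    return 2.0 ** (months / doubling_months)


if __name__ == "__main__":
    # 12 and 18 months bracket the test-bed phase; 36 months is the
    # three-year target for the full project.
    for months in (12, 18, 36):
        factor = growth_factor(months)
        print(f"{months} months: {factor:.0f}x the starting size")
```

Under that assumption, a database would be four to eight times its current size by the time the test bed is ready and 64 times its current size at the three-year mark.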
Computer grids take advantage of IP (Internet Protocol) networks to lash together computers and storage systems at various sites for the use of many related companies and institutions. The idea is to make those resources available to users all over the grid transparently, as if they were located in the same building. Grids rely on advanced scheduling and management software to allocate the available resources to the right users at the right time, across a variety of computer platforms. The international consortium The Globus Project is working to standardize practices for sharing and securing information across a grid.
Some grids are already in use, especially in Europe, such as a "National Grid" in the U.K. that IBM is building in cooperation with Oxford University. However, the North Carolina grid appears to be the first large grid project with major involvement by private industry, said Anne MacFarland, an analyst at IT consultancy The Clipper Group, in Wellesley, Massachusetts.
"(Grid technology) works fine, but the question always was, does it translate over to the commercial space, where people are competing vigorously?" MacFarland said. Gene research is a natural application for the technology, because the IT investment required in this field is hard to come by for one company, she added. "At the research stage, collaboration has time-to-market benefits for everybody."
The project will drive further development of grid technology that can be applied elsewhere, IBM's Kovac said.
"We'll learn a tremendous amount about the products . . . that will be our future offerings," she said.
Other industries, such as aeronautics, insurance, mass transit and defense, also might benefit from grid computing, Clipper's MacFarland said, but she cautioned that more work is needed to address privacy and security concerns.
"Whenever you share (data), you have to have a capability to isolate," she said.
The BioGrid will include an IBM eServer p690 that can process 1 trillion operations per second, according to an IBM statement. The grid also will use the IBM Enterprise Storage Server mainframe platform and Tivoli Storage Manager software, as well as IBM's DiscoveryLink data integration technology, which can integrate data from various sources, formats and file types into a virtual database. Servers and storage systems from most major vendors can be integrated into the grid, IBM's Kovac said.
Much of the connectivity for the grid will be provided through the North Carolina Research and Education Network. The network is undergoing an upgrade that will bring the bandwidth on the biggest trunks between major facilities to OC-192 (10G bps) by year's end. When the upgrade is finished, the slowest link on the network will be OC-3 (155M bps), Dunning of MCNC said.
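For a sense of scale, the link speeds Dunning quoted can be set against the petabyte of storage Kovac described. The Python sketch below is a rough estimate, not a measurement: it assumes a link runs at its full nominal rate with no protocol overhead or competing traffic, uses decimal units (1 petabyte = 1 million gigabytes, as in Kovac's remark), and the constant and function names are invented for this example.

```python
OC192_BPS = 10e9        # OC-192, using the article's 10G bps figure
OC3_BPS = 155e6         # OC-3, using the article's 155M bps figure
PETABYTE_BYTES = 1e15   # 1 million gigabytes, decimal units

SECONDS_PER_DAY = 86_400


def transfer_seconds(num_bytes: float, link_bps: float) -> float:
    """Idealized time to move `num_bytes` over a link of `link_bps` bits/s."""
    return num_bytes * 8 / link_bps


if __name__ == "__main__":
    for name, bps in (("OC-192", OC192_BPS), ("OC-3", OC3_BPS)):
        days = transfer_seconds(PETABYTE_BYTES, bps) / SECONDS_PER_DAY
        print(f"{name}: about {days:.1f} days to move one petabyte")
```

Even under these best-case assumptions, moving a full petabyte would take more than a week on the fastest trunk and well over a year on the slowest link, which suggests why a grid aims to share data in place rather than copy it wholesale between sites.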