Wednesday | 3 December, 2008
Homegrown high-performance computing
High-performance computing enters the reach of today's enterprise
Leon Erlanger (InfoWorld) 26/04/2007 12:12:14

Virginia Tech starts from scratch

At Virginia Tech's Advanced Research Institute (ARI), constructing an HPC cluster for cancer research has been an educational experience for the electrical and computer engineering grad students involved.

With little prior HPC experience, the students built a 16-node cluster and parallelized apps they had written in MATLAB, a numerical programming environment, over the course of several months. The project taps huge amounts of data acquired from biologists and physicians to perform molecular profiling of cancer patients. The students are also working on vehicle-related data for transportation projects.

Rather than make every aspect a learning experience, when it came to choosing an HPC platform, the students and professors decided to stick with what they already knew: Microsoft Windows.

"Our students had already been running MATLAB and all their other programs on Windows," says Dr. Saifur Rahman, director of ARI. "We didn't want to have to retrain them on Linux." As was the case at BAE Systems, there were also obvious advantages to a cluster that could integrate easily with the rest of ARI's Windows infrastructure, including Active Directory.

Microsoft had already approached Virginia Tech to be an early adopter of Windows Compute Cluster Server 2003, so Dr. Rahman and his team said yes and started looking for the right hardware. They vetted several vendors, but when they found out Microsoft was performing its own testing on Hewlett-Packard servers, they decided to go with HP. "We knew we'd need help from Microsoft to fix various bugs," says Dr. Rahman, "and since all their experience was on HP servers, we felt we'd have the most success with HP."

So with help from Microsoft and HP, ARI installed 16 HP ProLiant DL145 servers with dual-core 2.01GHz AMD Opteron 270 processors and 1GB of RAM each. On the same rack, ARI installed 1TB of HP FC storage. The rack also includes one head node, as well as an HP ProLiant DL385 G1 server with two dual-core 2.4GHz AMD64 processors and 4GB of RAM.

As did BAE Systems, ARI decided to stick with Gigabit Ethernet for its cluster interconnect, mainly because it was what the team knew. "There are other interconnects that are faster, but we've found that Gigabit Ethernet is pretty robust and works fine for our purposes," Dr. Rahman says. And after some servers overheated, ARI placed the entire cluster in a 55-degree Fahrenheit chilled server room.

ARI found parallelizing MATLAB apps to be a significant challenge requiring a number of iterations. "The students would work on parallelizing the algorithms, then run case studies to verify the results they were getting with the clustered applications were similar to results they got when they ran on one machine," Dr. Rahman says.

At first, the results weren't coinciding, and the students had to learn more about how to parallelize effectively and clean up what they had already coded. "We missed some important relationships at first," Dr. Rahman says. With some help from MATLAB, it took two graduate students about a month to get the app parallelization right.
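The verification loop Dr. Rahman describes, running the same case study on one machine and on the cluster and then comparing the outputs, can be sketched roughly as follows. This is an illustrative Python sketch only, not the students' MATLAB code: the `profile_score` function, the sample data, and the tolerance are all hypothetical, and a local thread pool stands in for the cluster scheduler.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def profile_score(sample):
    """Hypothetical per-sample statistic: mean of log-scaled measurements."""
    return sum(math.log1p(v) for v in sample) / len(sample)


def run_serial(samples):
    """Reference run: process every sample in order on one worker."""
    return [profile_score(s) for s in samples]


def run_parallel(samples, workers=4):
    """Parallel run: a local thread pool stands in for cluster nodes.

    pool.map returns results in input order, so the output lines up
    element by element with the serial reference.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(profile_score, samples))


def results_match(reference, candidate, rel_tol=1e-9):
    """True when both runs have the same length and agree within rel_tol."""
    return len(reference) == len(candidate) and all(
        math.isclose(x, y, rel_tol=rel_tol)
        for x, y in zip(reference, candidate)
    )


# Made-up case-study data: 8 samples of 5 measurements each.
samples = [[float(i + j) for j in range(1, 6)] for i in range(8)]

serial = run_serial(samples)
parallel = run_parallel(samples)
print("parallel run matches serial reference:", results_match(serial, parallel))
```

A check like this passes only when each unit of work is independent of the others. If one sample's result depends on another's, the kind of relationship that is easy to miss when splitting up an algorithm, the parallel run can diverge from the serial reference, and the comparison is what flags it.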

Dr. Rahman feels that the team's diverse expertise was a large factor in the project's success. One of the grad students had deep knowledge of molecular-level data quality, biomarkers, and the relevance of different data types; another offered a lot of hardware expertise; and the IT person had extensive experience working effectively with vendors. MATLAB provided help in determining which toolboxes were relevant to the task.

"When we went to MATLAB, they were just getting started with HPC," Dr. Rahman says. "I hope they will start to pay more attention, as it would be nice if they were all ready so we didn't have to spend months on this."

There were also hardware communications glitches.

"At first we had some problems controlling the servers as they talked to each other and the head node," Dr. Rahman says. "Sometimes they wouldn't respond. In other cases we wouldn't see any data coming through." Solving the problem took a lot of reconfiguring and reconnecting. "Perhaps we were giving the wrong commands at first. We're not sure," he adds. There were also problems with incorrect server and software license manager configurations.

Dr. Rahman says that managing the cluster has been relatively trouble-free with Windows Compute Cluster Server 2003. He adds that if he could do this all over again, he'd send his students to Microsoft for a longer time to learn more of what Microsoft itself has discovered about building clusters with HP servers. The use of HPC has enabled ARI researchers to dive much more deeply into molecular data, not only analysing differences in relationships among disparate classes of subjects, but also revealing more subtle but important variations within each class.
