In an effort to ease the load on its main Alpha-based supercomputer, and to add more processing power to the existing 100-node Intel cluster, the Australian Partnership for Advanced Computing (APAC) has ordered another 50 Intel computers which it will bring online using in-house skills.
Dr Bob Gingold, head of the Australian National University supercomputing facility and manager of the APAC national facility, said the decision to go with single-processor, off-the-shelf hardware was the result of a benchmarking and cost assessment.
“We did extensive benchmarking of the processing power gained by having dual-processor nodes,” Dr Gingold said. “The amount you gain from the second processor may be as little as 1 per cent, but on average it’s about 30 per cent. This is mainly due to memory limitations.”
Gingold said the best “bang for buck” is achieved with single-processor nodes, as dual-processor Intel Xeon systems are more expensive.
APAC originally went to tender earlier this year and purchased 100 Dell Pentium 4 towers, which it has arranged on shelves.
“Dell gave us a good deal and the cost of the cluster was not much more than a couple of hundred thousand dollars,” Gingold said. “Since space wasn’t a concern for us we didn’t need rack-mounted servers. I’m very pleased with the work my team did in arranging the towers and cabling. It only took a couple of weeks to assemble.”
With the main cluster in place and running at some 530 gigaflops, Gingold is preparing to add another 50 identical nodes to the cluster this month.
“We will be getting the next batch cheaper as that model has been superseded,” he said. “It works out at less than 50 per cent of the cost of the 100 nodes, yet we expect a 50 per cent increase in performance.”
The cluster – which is used exclusively for scientific and engineering research – runs Linux and other open source software.
“We haven’t experienced any serious problems with Linux, and OpenPBS [portable batch system], which we use for job scheduling, has been highly successful,” Gingold said. “The system is meant to be used by top-end users who are Linux savvy.”
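For readers unfamiliar with how users interact with a scheduler like OpenPBS, a typical job is described in a short shell script and handed to the batch system. The sketch below is generic; the job name, node count, walltime and program name are illustrative placeholders, not details of APAC’s configuration.

```shell
#!/bin/bash
# Minimal OpenPBS job script (illustrative sketch; resource
# requests and the program name are hypothetical, not APAC's).
#PBS -N example_job          # job name shown in the queue
#PBS -l nodes=4              # request four single-processor nodes
#PBS -l walltime=01:00:00    # maximum run time of one hour
#PBS -j oe                   # merge stdout and stderr into one file

cd "$PBS_O_WORKDIR"          # run from the directory the job was submitted in
mpirun -np 4 ./my_solver     # launch a (hypothetical) MPI program on the nodes
```

The script is submitted with `qsub job.sh`, and `qstat` shows its place in the queue; the scheduler decides when free nodes are available and runs the job without further user involvement.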
Although the cluster is not designed for high availability, Gingold is confident it can perform reliably.
“Machines these days are reliable enough that we need not be too concerned about whether a node goes down,” he said. “If a node did go down it would interrupt that job, but not affect the rest of the cluster.”
So far, the only hardware failure experienced was a disk failure, Gingold said.