Hewlett-Packard Co. and Cluster File Systems Inc. have been tapped to improve and adapt file system software for the U.S. Department of Energy (DOE) that will help manage large numbers of Linux servers, the groups announced Thursday.
HP, based in Palo Alto, California, will supply hardware and engineering expertise to help tune Cluster File Systems' Lustre software in a three-year agreement with the DOE's National Nuclear Security Administration (NNSA), which already has clusters of low-cost servers in place. The Lustre file system was designed to tackle the problem of reading and writing data across thousands of servers and storage systems in a cluster. The open source software, which may one day evolve into commercial applications, will be deployed at Lawrence Livermore National Laboratory, Los Alamos National Laboratory, Sandia National Laboratories and Pacific Northwest National Laboratory on HP systems, said Kathy Wheeler, high performance computing systems program manager at HP Labs.
"Currently, the customers for tightly-coupled clusters are in high-performance computing," Wheeler said. "But already given the price performance of these systems, there is growing interest in the enterprise world to adopt and use this type of system."
Building clusters by linking thousands of relatively low-cost servers makes it possible to generate the type of computing power found in a more expensive supercomputing system. But managing the movement of data across the clustered servers and storage systems, which sometimes number in the thousands, has proved challenging.
The Lustre file system, whose name combines "Linux" and "cluster," addresses some of these management concerns with its use of OBFS (object-based file system) technology. OBFS-based software is designed to speed up the way clustered server and storage systems read and write data, Wheeler said.
The technology requires the use of intelligent disk drives that, for example, know when to back up a file or send its contents to another drive. In addition, data would be spread -- a process also called data virtualization -- across all of the hardware in a cluster instead of being locked into individual servers. This means that changes made to data in one server are reflected immediately in the entire cluster, thus keeping the data uniform for all concurrent users.
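As a rough illustration of what spreading data across a cluster can look like, the sketch below splits a file's bytes into fixed-size chunks and deals them out round-robin to a set of storage targets, then reads them back in order. It is a toy model under stated assumptions, not Lustre's actual layout or API: the function names `stripe` and `reassemble`, the chunk size and the in-memory "targets" are all hypothetical.

```python
from typing import List


def stripe(data: bytes, num_targets: int, stripe_size: int) -> List[List[bytes]]:
    """Deal fixed-size chunks of `data` round-robin across storage targets.

    Hypothetical helper for illustration only; each target is modeled as
    an in-memory list of chunks rather than a real disk or server.
    """
    if num_targets < 1 or stripe_size < 1:
        raise ValueError("num_targets and stripe_size must be positive")
    targets: List[List[bytes]] = [[] for _ in range(num_targets)]
    for offset in range(0, len(data), stripe_size):
        chunk_index = offset // stripe_size
        # Chunk k lands on target k mod num_targets.
        targets[chunk_index % num_targets].append(data[offset:offset + stripe_size])
    return targets


def reassemble(targets: List[List[bytes]]) -> bytes:
    """Rebuild the original bytes by reading one chunk per target, row by row."""
    rows = max((len(t) for t in targets), default=0)
    chunks = []
    for row in range(rows):
        for target in targets:
            if row < len(target):
                chunks.append(target[row])
    return b"".join(chunks)


if __name__ == "__main__":
    layout = stripe(b"abcdefghij", num_targets=3, stripe_size=2)
    for i, chunks in enumerate(layout):
        print(f"target {i}: {chunks}")
    print(reassemble(layout))
```

Because consecutive chunks sit on different targets, several of them can in principle be read or written at the same time, which is the general appeal of spreading data across hardware rather than tying it to one server.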
Work on Lustre was started at Carnegie Mellon University but has been further developed by several organizations over the last three years. Cluster File Systems, HP, Seagate Technology LLC, and the NNSA have all contributed to the software, Wheeler said.
An early version of the file system, called Lustre Light, will be rolled out across 100 servers next month at the Lawrence Livermore National Laboratory, Los Alamos National Laboratory, Sandia National Laboratories and Pacific Northwest National Laboratory.
The organizations involved will then continue to improve the technology over the next three years. The software will be released under the GPL (GNU General Public License) open source license, Wheeler said.
HP, Cluster File Systems and the NNSA will work to add more failover and security features to the file system along with advanced management tools. Eventually, the technology could be used by companies for large-scale data mining tasks or even for rendering animation, Wheeler said.