Can storage systems also become number crunchers? That's what a new research project, launched Tuesday by Silicon Graphics (SGI) and the Pacific Northwest National Laboratory (PNNL), aims to discover.
SGI is helping to fund a year-long PNNL project to write software that will let storage devices do calculations directly, rather than simply serving up files to other computers for processing. The software, which is based on the open-source Lustre file system, could eventually greatly improve the processing capabilities of large-scale computing clusters, researchers say.
In most file systems today, data is broken up for storage on the hard drive and must first be recombined before it can be processed, said Scott Studham, manager of computer operations with the PNNL's Molecular Science Computing Facility.
Because of the structure of Lustre, however, this recombination step is not necessary. PNNL researchers are developing software that takes advantage of this fact to harness the processors on storage arrays so that they can perform calculations whenever they are otherwise unoccupied, a process Studham calls "active storage."
"Lustre is the first real production-scale, object-based file system, so we'll be able to go ahead and do the processing inside the file system," he said.
The researchers will use this software to help speed up proteomic research being done at PNNL by off-loading certain calculations, called Fourier transforms, to the file system, Studham said.
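As a rough illustration of the idea, the sketch below contrasts a conventional read, which ships raw data to the client, with an active-storage call that runs a Fourier transform where the data lives and returns only a small result. It is a toy model, not PNNL's software or Lustre's API: the `StorageNode` class, its methods, and the object name `scan-001` are all hypothetical, and the transform is a naive one chosen for brevity.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform; returns complex coefficients."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


class StorageNode:
    """Hypothetical storage server that holds whole objects and can compute on them."""

    def __init__(self):
        self.objects = {}

    def put(self, name, samples):
        self.objects[name] = list(samples)

    def read(self, name):
        # Conventional path: ship the raw data to the client for processing.
        return self.objects[name]

    def dominant_bin(self, name):
        # Active-storage path: transform locally and return one small result.
        spectrum = dft(self.objects[name])
        half = len(spectrum) // 2
        # Skip bin 0 (the constant offset) and the mirrored upper half.
        return max(range(1, half), key=lambda k: abs(spectrum[k]))


node = StorageNode()
# A made-up signal: 64 samples containing 5 full sine cycles.
signal = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
node.put("scan-001", signal)
peak = node.dominant_bin("scan-001")
print(peak)
```

In this toy model the client receives a single integer rather than all 64 samples. That reduction in data movement is the kind of saving the approach is after, although a real system would work on far larger objects.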
Active storage could eventually be useful in any area that involves data mining, including the chemical industry, law enforcement, marketing databases, and the insurance industry, said Phil Schwan, chief executive officer of the company that develops Lustre, Cluster File Systems.
Existing data-mining applications will need some modification to take advantage of active storage, Schwan said, although he does not expect this to be especially difficult. "I don't think there's a lot of very special application programming that needs to be done here," he said.
A larger challenge is that Lustre itself is not widely used outside of high-performance computing. Schwan said his company is working to develop a broader set of Lustre users.
Studham declined to say how much money SGI is contributing toward the active storage project, but it will be enough to fund at least two full-time software developers, he said.
As part of its collaboration with SGI, PNNL recently signed a US$4.9 million contract with the computer vendor to build a 1.3-petabyte Lustre file system at the lab, with an additional 1.3 petabytes of tape storage, he said.
SGI executives weren't immediately available to comment.