BOSTON (03/29/2000) - Hewlett-Packard Co. will contribute US$1.8 million to help the Massachusetts Institute of Technology (MIT) create a digital archive.
HP Labs will fund the two-year project to develop the archive, which will hold the approximately 10,000 articles produced annually by MIT faculty and researchers, as well as technical reports from MIT labs and centers, according to Eric Celeste, assistant director for technology planning and administration for MIT libraries.
Most if not all of the articles will be published somewhere besides the archive -- for example, in scholarly journals -- but not all of the articles' accompanying data will have made it into print, and one of the archive's chief functions will be to house that data for viewing by interested readers, according to Celeste.
For example, academics often have to pay for color plates out of their own pockets, so many plates go unpublished, Celeste said. The archive will be able to preserve those plates, as well as MPEG video clips and other data sets that accompany an article, he said.
"We have, for instance, a professor who is gathering MRI (magnetic resonance imaging) images," Celeste said. "We want to make a place at the institution to host that kind of material."
For HP Labs, the archive project is an opportunity to co-develop new technologies in a market of keen interest to HP, according to Stephen Brown, marketing manager at HP Labs. The archive project may not directly yield commercial products for HP to sell, but funding the project gives HP insight into technologies relevant to future products, Brown said.
"You can call it a test bed for a lot of new developing technology in the digital library space," Brown said.
Several HP staffers will be onsite at MIT in Cambridge, Massachusetts. Between now and December, HP and MIT will determine the exact technology the archive will use. Some prototype technologies from HP Labs may be used, especially those related to document formatting, printing and databases, but commercial technologies from HP as well as other vendors will also be used, Brown said.
One major difficulty the archive architects must confront is format obsolescence. Articles and supporting data published today may be in a format bumped into oblivion next year by new technology, and the librarians must decide how best to ensure that today's data is accessible tomorrow.
"That's an enormous problem, and we don't have the answer here yet," MIT's Celeste said. "One thing we'll do, in all likelihood, is pick formats we think will last."
But a format's longevity is no guarantee that documents written in it will always be accessible, given software vendors' pattern of releasing successive versions of their products. For example, many academics write in Microsoft Corp.'s Word, but Word's proprietary technology makes its different versions potentially problematic, according to Celeste.
"Microsoft Word is an incredibly unstable format. You can't predict if it will be forward or backward compatible," Celeste said. Moreover, Word documents are highly dependent on their context, such as the fonts they were created in, he said.
Formats that are easier to handle include TIFF images, HTML certainly, and even Adobe Systems Inc.'s PDF, Celeste said. PDF is proprietary, but Adobe has published some of its specs, so "we can have some faith that we can unravel them if we need to over time," he said.
The archive will be introduced as a live service in September 2001, according to Celeste. Within MIT, digital certificates will be used to verify users' identities and determine what level of access to the archive they can be granted, he said. Portions of the archive will be available to the public, but the level of public access and how it will be controlled have yet to be determined, he said.
Access to the architecture the archive selects, however, will be made available in some fashion to other libraries and institutions, according to Celeste.
"We're trying very much to make whatever the result is openly available," Celeste said.
The archive can be found at http://web.mit.edu/dspace/.