The Wall Street Journal recently reported that the Internet, or at least a "chunk" of it, to use the Journal's terminology, was given to the Library of Congress.
It turns out that Alexa Internet, a Web crawler company in Seattle, gave the library an archive of a half million Web pages that adds up to about 2 Tbytes of data.
The archive is a snapshot of the World Wide Web taken early last year. The data is housed in a computer rack-like structure with four bright red computer monitors stacked one on top of the other. The monitors display random pages from the archive every few seconds.
Why would the library want such a toy, other than because it's fun to watch?
One task libraries undertake is archiving the times in which they exist. For example, they archive newspapers and sometimes tapes of TV and radio programs, so that future scholars can get a better idea of the context in which events happened. For a historian, it can be helpful to know what the popular press focused on during the time leading up to a major event, such as the start of a war, and what topics dominated conversation after the event.
This type of archiving proved easy when all the news was in print; microfilms of old newspapers did the trick. But things became more complex with the advent of film, radio and TV. It would be hard to assemble a good history of the Vietnam War without having access to archival copies of the evening news broadcasts.
Archiving the current world is even more difficult. More and more of what affects our lives is now found on the Internet in newsgroups, e-mail messages to mailing lists and Web pages. For example, huge numbers of people have had access to the Starr report via the Web. Some of them might have even read it, rather than just skimmed it looking for the naughty bits. While the report itself was published on paper, the backup material was not. The only way this material existed for most of the world was as bits on the Net.
I will say that archiving digital information can sometimes be difficult to justify. It can be quite hard to see that useful information for analyzing current society could come from some Internet mailing lists. For example, a discussion about the evils of spam (the e-mail kind, not the canned meat product kind), approaching a kind of perpetual motion, has taken over the com-priv mailing list.
Then again, in Boston a few years ago it was decided, for the sake of preserving '50s culture, that it was vital to preserve the big neon Citgo sign towering over Kenmore Square.
So, although culture is definitely in the eye of the beholder, the ephemeral Web will have to be part of the archive if future generations are to know what affects our thinking today.
Disclaimer: Compared to Harvard, much of the world has proven to be ephemeral, but the above are my ephemeral observations.
Scott Bradner is a consultant with Harvard University's University Information Systems. He can be reached at firstname.lastname@example.org.