I'll admit it: I'm a data junkie.
That's not just because I'm in the research business -- though admittedly, it's the perfect gig for someone with an obsessive desire to measure, record and track the effectiveness of everything.
No, my fascination with data isn't just a consequence of my day job. It comes from both temperament and training. When I was 9, I measured the relative effectiveness of two processes for shelling peas. (No, I'm not kidding, and yes, I was a really weird kid.)
In my years as an engineer and physicist, I maintained a focus on measurement -- one of my earliest research designs in high-energy physics was a liquid-argon calorimeter, which measures the energy created by a particle-physics experiment.
That's why I'm appalled at the state of Internet measurement now. Even though companies are becoming utterly reliant on the 'Net, we've never known less about Internet structure and performance -- and that's a huge problem.
The best data we have is compiled by the good folks at the Cooperative Association for Internet Data Analysis (CAIDA), a not-for-profit research group run by the University of California at San Diego and funded by government research grants along with a handful of high-tech companies (see www.caida.org/home/).
Yet CAIDA's principal investigator and director, K. Claffy, routinely laments the lack of high-quality data her team can access and analyze. We don't even have an up-to-date map of the Internet, let alone a meaningful measurement of its traffic flows. The best available public data is at CAIDA, but it's woefully incomplete.
The problem is twofold: First, scientific researchers, while eager to access others' data sets, are reluctant to release their own. And second, the players with the greatest insight into Internet structure and performance -- the carriers -- are reluctant to reveal their inner workings to each other.
As a result, we lack a comprehensive view of the most sophisticated piece of infrastructure ever created. That's pretty disturbing, given how integral the Internet is to our global economy. Without a complete map of it -- let alone a detailed understanding of its day-to-day performance -- we can't ensure that it will continue to work reliably. If that doesn't scare you, it should.
What should be done? First, enterprises, vendors and service providers should start actively supporting cooperative Internet measurement projects, financially and by providing insight into their challenges. These organizations will benefit individually as well as help ensure the long-term stability of the 'Net.
Second, scientific researchers and the entities that support them (universities, government and industrial research labs) should insist on sharing data sets publicly. CAIDA operates one repository for shared data sets, DatCat, a catalog of Internet measurement data, and there are others. Researchers should be required to make their data sets available through at least one of the public catalogs.
Above all, we should start getting serious about measuring the Internet. The future quite literally may depend on it.
Johnson is president and chief research officer at Nemertes Research, an independent technology research firm. She can be reached at firstname.lastname@example.org.