Data size is about to get out of control

  • Mike Karp (Network World)
  • 07 September, 2005 09:43

My friends in Seattle tell me there are two types of weather in that city, and that there is an easy way to tell which is which: if you can't see Mount Rainier, it is raining; if you can see Mount Rainier, it's about to rain.

It occurs to me that Seattle's weather might serve as a pretty good metaphor for data storage. If you are not managing your data, it is out of control and you are likely being inundated; if you are managing your data, the inundation is about to happen anyway.

One group of IT managers in particular is skating on the edge. You are likely a member of this group if the data under your care supports any of the following sorts of projects:

  • A plan for turning analog data into digital data (this might include scanning old CAD drawings or other paper records into a database for archiving, but also likely takes in almost any kind of cataloging project that captures an image).
  • Managing healthcare data such as MRI, x-ray or other such records. The imaging technology behind them has become markedly more accurate lately, and the appreciable increase in the granularity of its images has significantly increased the size of the data files that are produced.
  • Creating, storing and moving content for HDTV, the files for which are many times larger than content for the old NTSC, PAL and SECAM formats that we all grew up with.

The change in the total amount of data under management is an obvious difference, of course, but data growth, even "exponential data growth" (no one ever tells us what the exponent is, by the way), is sort of like eating baked beans for lunch - you will have a reasonably good idea what is going to happen, and the more well-considered - and considerate - among you will plan accordingly. I am concerned about something beyond the quantitative issue.

Newer kinds of data - high definition TV content in place of content for older formats, MRI images that are 100 times more accurate than the ones you stored even two years ago - will be used differently and will have to be managed differently from their predecessors. In fact, I am willing to bet that in each of the situations I mention above, and assuredly in numerous others as well, you will find that the quantitative issues of storing data on disk pale into relative insignificance when compared to the changes made necessary by the ways the new data must be managed.

We look at one aspect of this next time.