While vendors and analysts argue the fine points of what information lifecycle management (ILM) means, some customers are just doing it and seeing impressive cost savings and productivity improvements as a result.
There are big differences among implementation strategies, the problems these companies face and the technologies they use. But what's similar is that each organization listens carefully to its users and crafts an ILM strategy to meet its business needs.
ILM refers to a set of policies for storing, managing and retrieving information based on the changing value of the information to the business. By storing only the more critical, time-sensitive information on higher performing but more expensive storage, proponents say, ILM allows storage managers to provide users with better information access at lower cost.
ILM is similar to other familiar concepts, which include:
- Hierarchical storage management, which involves moving data to slower, less expensive drives, usually based on the age of the data or how often it is accessed. ILM backers say it is more advanced because it takes into account many strategic factors, such as the value of the data to the business.
- Tiered storage, which refers to a storage architecture that deploys different types of storage (such as high-speed Fibre Channel disk drives, lower-speed Serial ATA-based disk and tape libraries) to store different types of information. A tiered storage architecture is generally considered to be a prerequisite for implementing ILM, but it is not in and of itself an ILM strategy.
- Data lifecycle management, a concept closely related to ILM. Some ILM backers say data lifecycle management is more limited than ILM because it focuses on moving data to protect it from loss or to free up space on storage hardware, rather than on satisfying strategic business requirements.
However it is known, ILM is driven by the need to reduce the cost and increase the efficiency of providing information to users. This common driver can be seen through various examples of companies that have implemented ILM.
For instance, providing cost-effective data access is strategic for AmeriVault, an online data services provider in Waltham, Mass. It began looking into ILM about a year and a half ago when it realized much of the 35 terabytes of the customer data it stored was comparatively static and accessed so infrequently that it could be safely relocated to somewhat slower, lower cost ATA-based RAID storage.
According to Kevin Harris, chief technology officer at AmeriVault, the company's customers would see no performance loss under this new storage architecture, and the company could remain competitive while still providing offsite backup less expensively than if it continued to do backup locally.
Costs were also the driver for State Street Global Advisors, the investment research and trading arm of State Street Corp. In the late '90s, the firm built a SAN infrastructure that met its stringent requirements for data recovery and data availability. However, the space required for replication and backup, as well as the fact that the SANs couldn't share excess capacity, meant that 35% to 40% of the expensive SAN capacity went unused. By 2002, ever-rising demands for storage (and the resulting spending on storage hardware) forced Lars Linden, principal at the company, to find a better way.
Meanwhile, accessibility and high data volumes were the clincher at MLB Advanced Media, the online component of Major League Baseball. It built a storage architecture to capture, track, repackage and sell video and audio of Major League Baseball games to customers such as cable TV networks, says vice president and chief architect Justin Shaffer.
The 2,500 Major League games played each year result in about 200 terabytes of new data, all of which needs to be easily accessible by the league's production staff for repackaging as, say, a DVD of World Series greatest plays. As high-definition TV broadcasts become more popular, Shaffer's storage needs will jump by at least a factor of two, he says.
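As a rough back-of-the-envelope check on those figures, the short sketch below works out the average data volume per game. It assumes decimal units (1 terabyte = 1,000 gigabytes) and, purely for illustration, applies Shaffer's factor of two to the annual volume of new data; neither assumption comes from MLB.

```python
# Figures quoted in the article.
GAMES_PER_YEAR = 2500
NEW_DATA_TB_PER_YEAR = 200

# Assumption: decimal units, 1 TB = 1,000 GB.
per_game_gb = NEW_DATA_TB_PER_YEAR * 1000 / GAMES_PER_YEAR

# Assumption: the "factor of two" is applied to annual new data.
HD_GROWTH_FACTOR = 2
hd_new_data_tb = NEW_DATA_TB_PER_YEAR * HD_GROWTH_FACTOR

print(f"Average new data per game: {per_game_gb:.0f} GB")
print(f"Annual new data after a doubling: {hd_new_data_tb} TB")
```

Under these assumptions, each game averages about 80GB of new data, and a doubling would put the annual total at 400 terabytes or more.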
At Direct Media, it was e-mail that propelled the company toward ILM. The Greenwich, Conn., firm sells mailing lists and other tools for direct marketing campaigns, and brokers at the 200-person company rely on e-mail to send customers samples of direct-mail campaigns, reports on the number of names provided to them or descriptions of the lists from which those names were drawn.
Over time, the e-mails a broker has sent to customers become a vital source of information as Direct Media sells lists to new customers, says Kevin Ladd, director of infrastructure.
But two years ago, that e-mail had grown to 180GB of .PST files that soaked up 70% of the available DAS space on Direct Media's e-mail servers. As a result, brokers found their e-mail freezing as they hit the storage limits on their servers. In addition, .PST files became corrupted as they grew beyond a gigabyte, and it became hard to back up files during the nightly backup window.
If pressed, the brokers might have been able to delete some of their old e-mails, Ladd says, "but it's a waste of their time to do that, and it interferes with their preferred way of doing business."
Sometimes companies pursue ILM for one thing, and it evolves into something else. NetBank in Alpharetta, Ga., first tackled ILM three years ago, primarily as a way to save money, says Todd Warnock, director of technology services. "But in the last two years, we shifted our focus and said ILM is really technology-agnostic, and it's more of a business issue, relating to how we deal with this data." The bank's ILM strategy has moved beyond storage to creating policies for data and record management, whether the data is stored on paper or in digital form.
Classifying the data
In each of these examples, the first step was to classify data into various tiers, each of which was to be stored on different classes of hardware. AmeriVault, for instance, has always separated its data into two simple tiers: static and dynamic. But by using VisualSRM storage resource management software from EMC to track how often its vaulting application read each of the files under its management, the company found that far less data than expected (15% to 20%) was truly dynamic, meaning the rest didn't need to be on the more expensive, higher performing Fibre Channel SAN.
As a result, the company was able to cut storage spending by 60%, even while its data under management has grown by three to four terabytes per year.
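The kind of analysis AmeriVault describes, splitting files into static and dynamic based on how often they are read, can be sketched in a few lines. This is a minimal illustration, not AmeriVault's method or the VisualSRM interface: the `FileStats` record, the 90-day window and the `min_reads` threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    path: str
    size_gb: float
    reads_last_90_days: int  # hypothetical metric taken from an SRM report


def is_dynamic(stats: FileStats, min_reads: int = 5) -> bool:
    """Treat a file as dynamic if it was read at least min_reads times."""
    return stats.reads_last_90_days >= min_reads


def dynamic_share(files: list[FileStats], min_reads: int = 5) -> float:
    """Return the fraction of total capacity (by size) held by dynamic files."""
    total = sum(f.size_gb for f in files)
    if total == 0:
        return 0.0
    dynamic = sum(f.size_gb for f in files if is_dynamic(f, min_reads))
    return dynamic / total


# Made-up sample data: one large idle file, one small busy file.
sample = [
    FileStats("vault/a.bak", 80.0, 0),
    FileStats("vault/b.bak", 20.0, 12),
]
print(f"dynamic share: {dynamic_share(sample):.0%}")
```

Measuring the share by capacity rather than by file count is a design choice here, since it is capacity on the expensive tier that drives cost.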
At MLB, meanwhile, classifying data "wasn't that difficult" because it was determined by the company's existing processes for video production, Shaffer says. These requirements dictate, for example, that video for 15 games per night must be immediately available for the production staff, after which it can be stored on tape and accessible within five minutes.
While many companies start out with five or more tiers of data, most find they can narrow that to three, says Dennis Hoffman, vice president of marketing for the EMC Software Group. That was the case for Direct Media, whose three tiers consist of the following:
- E-mails that are less than 30 days old, which are stored online for immediate access.
- E-mails between 31 days and three years old, which are stored "near online," using Veritas's Enterprise Vault content archiving software and accessible within 10 seconds.
- E-mails that are older than three years. These are stored in a tape library.
It was mainly technical reasons that dictated the formation of these tiers, such as the fact that they would result in reasonably sized files on which the servers could perform offline integrity checks. But users are also comfortable with the architecture, especially since there's only a five-second difference in access time between data that is less than a day old and data that is 32 days old.
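An age-based policy like Direct Media's three tiers boils down to a simple lookup. The sketch below is an illustration only: the function name and tier labels are hypothetical, it assumes that an e-mail exactly 30 days old still counts as online, and it approximates three years as 1,095 days.

```python
from datetime import date, timedelta

# Assumption: the online tier includes e-mails exactly 30 days old.
ONLINE_MAX_AGE = timedelta(days=30)
# Assumption: "three years" approximated as 3 * 365 days, ignoring leap days.
NEAR_ONLINE_MAX_AGE = timedelta(days=3 * 365)


def email_tier(sent_on: date, today: date) -> str:
    """Map an e-mail's age to one of three storage tiers."""
    age = today - sent_on
    if age <= ONLINE_MAX_AGE:
        return "online"
    if age <= NEAR_ONLINE_MAX_AGE:
        return "near-online"
    return "tape"


today = date(2005, 6, 1)  # arbitrary reference date for the example
for sent in (date(2005, 5, 20), date(2005, 4, 1), date(2001, 1, 1)):
    print(sent, "->", email_tier(sent, today))
```

In practice the cutoffs would be set by the archiving product's policy settings rather than hand-written code; the point is only that the tier follows mechanically from the message's age.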
Six was the magic number for State Street Global Advisors, which conducted a detailed analysis of data usage and the cost to provide that storage to customers. It created six tiers of data stored on various technologies, including content addressable storage for files that are never changed once they are stored. With this setup, State Street is now able to meet its users' availability and recovery needs with less than 10% excess capacity, compared with as high as 35% to 40% before ILM.
At NetBank, creating an ILM architecture involved an examination of how much of the company's data really needed to be retained, and for how long. The analysis resulted in two tiers of data access (online or offsite on tape) and four periods of data retention, including very short-term, three years, seven years and forever. The new architecture has enabled NetBank to reduce its tape backup costs by 70% and cut the time and expense of producing reports needed for regulatory compliance.
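A retention schedule such as NetBank's four periods can be expressed as a small lookup that returns the earliest date a record may be disposed of. This is a sketch under stated assumptions, not NetBank's implementation: the names are hypothetical, and because the article does not say how long "very short-term" is, the 90-day figure is a placeholder.

```python
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Retention(Enum):
    SHORT_TERM = "very short-term"
    THREE_YEARS = "three years"
    SEVEN_YEARS = "seven years"
    FOREVER = "forever"


SHORT_TERM_DAYS = 90  # placeholder; the article gives no figure


def disposal_date(created: date, retention: Retention) -> Optional[date]:
    """Return the earliest disposal date, or None if the record is kept forever."""
    if retention is Retention.FOREVER:
        return None
    if retention is Retention.SHORT_TERM:
        return created + timedelta(days=SHORT_TERM_DAYS)
    years = 3 if retention is Retention.THREE_YEARS else 7
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # Created on Feb. 29 and the target year is not a leap year.
        return created.replace(year=created.year + years, day=28)


print(disposal_date(date(2005, 1, 15), Retention.SEVEN_YEARS))
print(disposal_date(date(2005, 1, 15), Retention.FOREVER))
```

The retention period is independent of where the data lives, which matches NetBank's point that the policy applies whether a record is online, on tape or on paper.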
Implementing ILM across the entire business all at once "could be overwhelming," warns Harris. He suggests conducting data analysis in stages and in close consultation with users.
Customers should start with their most important data or data that is causing them the "greatest amount of pain," says Carolyn DiCenzo, a vice president of research at Gartner. "This gradual approach gives storage managers some experience and some success they can share with their managers. They can then apply that to the next application or the next set of data," she says.
The time to ask users to classify their data is when they first ask for storage space, says DiCenzo. Questions to ask include: How long must the data be kept? How quickly must it be retrieved? How often will it be accessed? When can it be deleted or moved to archival storage?
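One way to make those questions stick is to record the answers in a structured form at the moment storage is requested. The sketch below is hypothetical throughout: the `StorageRequest` fields, the tier names and the thresholds in `suggest_tier` are illustrative assumptions, not recommendations from Gartner or any vendor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StorageRequest:
    application: str
    keep_for_days: Optional[int]       # None means keep indefinitely
    max_retrieval_seconds: float       # how quickly the data must come back
    reads_per_month: int               # expected access frequency
    archive_after_days: Optional[int]  # None means never move to archive


def suggest_tier(req: StorageRequest) -> str:
    """Suggest a starting tier from the user's answers (illustrative thresholds)."""
    if req.max_retrieval_seconds < 1 or req.reads_per_month >= 100:
        return "high-performance disk"
    if req.max_retrieval_seconds <= 60:
        return "lower-cost disk"
    return "tape"


request = StorageRequest(
    application="quarterly-reports",  # made-up example
    keep_for_days=7 * 365,
    max_retrieval_seconds=30,
    reads_per_month=2,
    archive_after_days=90,
)
print(suggest_tier(request))
```

The suggestion is only a starting point for the conversation with the user; measured access patterns can later confirm or overturn it.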
Some business managers may find it hard to assign business value to their data, but increasingly, they are being spurred by the use of chargebacks to business units that keep their data for longer periods of time or that store it on high-end hardware that provides faster access.
"As you tell people, 'You're going to pay for storage,' the logical question they'll come back with is, 'How do I reduce my costs?'" says Warnock. "You reduce your costs by not keeping stuff you don't need to keep."
What's the best strategy when users insist, as they often will, that they need instant access, all the time, to a certain set of data, regardless of the cost? Hear them out, says Harris. "Find out what they think they're doing and what they think their needs are, and indicate to them you're going to run some analysis on the data to take a look at what really happens, and you would be happy to share the results with them, so that together you can evaluate a solution that works," Harris says.
In other words, don't dictate storage rules. Work with your storage users to create an ILM strategy that makes sense from both a technical and a business standpoint.