Information life-cycle management (ILM) is the policy-driven management of information as it changes value throughout its life cycle. Under ILM, data and data sets will migrate around a storage hierarchy based on a company's storage policy.
The principles of ILM make enormously good sense. ILM will generate better operational practices, consistent compliance and more economical use of storage -- demonstrable business value. But the cross-application management software necessary to make integrated ILM a practical, enterprisewide solution is four or five years away. Waiting for ILM to mature is a poor approach, since the time can be used productively and economically to evolve into an ILM environment.
Here is a six-step recipe for how organizations can proceed in the adoption of ILM. These steps should be taken in order, because each step implies the existence of the prior one, not necessarily everywhere in the enterprise, but rather within the experience of the IT organization.
1. Centralize each data center's storage.
ILM is most effective in a networked storage environment where storage management economies on centrally managed pools of storage can have the greatest benefits. Storage-area networks, supplemented with network-attached storage, are now affordable for even small companies.
An increasing number of IT professionals are standardizing on a four-pool storage model for data centers. The pools cover the four types of data storage most often found in the enterprise, including a new form of disk storage called "midline" data. These four pools are:
- Online dynamic data: Online transaction processing and high-activity decision-support systems on Fibre Channel/SCSI disks. This includes structured data such as databases.
- Midline data: Capacity-oriented disks for active fixed content and compliance data. These disks cost about 25% as much as Fibre Channel/SCSI and offer moderate random access to data that seldom or never changes. More than half of enterprise data is semistructured or unstructured and seldom changes. With industry projections of storage growth at 45% a year, there is a huge swing coming toward putting reference data on midline disks.
- Near-line buffered data: Today's disk-to-tape backup and restore pool adds midline disk-to-disk backup and restore, compressing backup times by up to 50%. The disk library is the successor to today's tape library.
- Off-line data: No change in off-site sequential tape for disaster recovery. However, those tapes represent an enormous liability in litigation, so make sure your data retention policy is adhered to.
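To put the 45% annual growth projection in perspective, the compounding can be worked through in a few lines. This is only a sketch: the 100TB starting capacity and the five-year horizon are illustrative assumptions, not figures from any survey.

```python
def project_capacity(start_tb: float, annual_growth: float, years: int) -> list[float]:
    """Return projected capacity for year 0 through `years` under compound growth."""
    return [start_tb * (1 + annual_growth) ** year for year in range(years + 1)]


# Hypothetical starting point: 100TB today, growing 45% a year.
projection = project_capacity(start_tb=100.0, annual_growth=0.45, years=5)
for year, tb in enumerate(projection):
    print(f"year {year}: {tb:,.0f}TB")
```

At that rate, capacity grows roughly 6.4-fold in five years, which is why the choice of pool for seldom-changing data matters so much to cost.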
2. Create pools of storage on three axes.
A data classification process will identify application data requirements and match them to pools. The type of data is one axis. One way to think about data is as structured, semistructured and unstructured:
- Structured data is typically database data, which can be sorted.
- Semistructured data is text information, such as e-mail and word-processing documents, which can be searched.
- Unstructured data is bit-mapped data, such as medical images, video files and audio files, which can be sensed.
The second axis is use, which can include frequency of access (active or inactive) and frequency of updating (fixed or changeable). The pools of storage themselves (online, midline, near-line and off-line) make up the third axis.
Systematically analyzing which data belongs in which pool is messy, no doubt about it. But the process will be enormously beneficial to designing a hierarchy of data unique to your specific industry and company.
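One way to make the three axes concrete is to record type and use for each data set and derive a pool from them. The sketch below is illustrative only: the class names, the sample data sets and the placement rules in `choose_pool` are all assumptions, and a real classification would reflect your own industry and company.

```python
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    STRUCTURED = "structured"
    SEMISTRUCTURED = "semistructured"
    UNSTRUCTURED = "unstructured"


class Pool(Enum):
    ONLINE = "online"
    MIDLINE = "midline"
    NEAR_LINE = "near-line"


@dataclass(frozen=True)
class DataSet:
    name: str
    data_type: DataType  # axis 1: type of data
    active: bool         # axis 2: frequency of access
    changeable: bool     # axis 2: frequency of updating


def choose_pool(ds: DataSet) -> Pool:
    """Hypothetical placement rules keyed on use; axis 3 is the result."""
    if not ds.active:
        return Pool.NEAR_LINE  # assumed: inactive data leaves primary disk
    if ds.changeable:
        return Pool.ONLINE     # active and frequently updated
    return Pool.MIDLINE        # active but fixed content


# Made-up examples covering each data type.
catalog = [
    DataSet("orders_db", DataType.STRUCTURED, active=True, changeable=True),
    DataSet("email_archive", DataType.SEMISTRUCTURED, active=True, changeable=False),
    DataSet("old_scans", DataType.UNSTRUCTURED, active=False, changeable=False),
]
for ds in catalog:
    print(f"{ds.name}: {ds.data_type.value} -> {choose_pool(ds).value}")
```

The off-line pool is left out on purpose, since the off-site tapes described earlier hold disaster recovery copies rather than primary data.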
3. Create life-cycle policies.
IT organizations have to start with existing storage management products and policies, especially with backup/restore software. Then add ILM-oriented, hierarchy-based policies as appropriate. Improving storage asset use and reducing storage administration costs are drivers for this process. Determining best practices for compliance in general and data disposal policies in particular will follow. Creating the policies won't be trivial, because many "interested parties" in the enterprise -- starting with the legal department -- will be involved.
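A life-cycle policy can be thought of as a small table of age thresholds per class of data. The following is a minimal sketch under stated assumptions: the field names, the 90-day and seven-year thresholds and the action labels are hypothetical, and real retention periods must come from the legal and compliance review described above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LifecyclePolicy:
    data_class: str
    midline_after_days: int  # age at which data leaves the online pool
    dispose_after_days: int  # age at which data must be disposed of


def next_action(policy: LifecyclePolicy, created: date, today: date) -> str:
    """Return the action the policy calls for, given the data's age in days."""
    age_days = (today - created).days
    if age_days >= policy.dispose_after_days:
        return "dispose"
    if age_days >= policy.midline_after_days:
        return "move-to-midline"
    return "keep-online"


# Hypothetical policy: 90 days online, disposal after seven years.
invoices = LifecyclePolicy("invoices", midline_after_days=90, dispose_after_days=7 * 365)
print(next_action(invoices, created=date(2005, 1, 1), today=date(2005, 6, 1)))
```

Keeping the disposal threshold in the same record as the migration threshold makes it harder for retention rules and storage placement to drift apart.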
4. Populate new/rehosted applications on appropriate pools of storage.
The key assumptions are that every application platform (hardware, application revision, etc.) changes within five years, and that a platform migration cannot be forced ahead of schedule. So develop standards that mandate how pools of storage are implemented for new and rehosted platforms. Then wait for each application to reach its five-year "refresh point," where its data can be reallocated to the proper pool.
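The refresh-point idea lends itself to a simple schedule: record when each platform was deployed and flag the ones that have reached five years. This is a sketch only; the application names and deployment dates are made up for illustration.

```python
from datetime import date

REFRESH_YEARS = 5  # the five-year refresh assumption described above


def refresh_date(deployed: date) -> date:
    """Return the date on which a platform reaches its refresh point."""
    try:
        return deployed.replace(year=deployed.year + REFRESH_YEARS)
    except ValueError:
        # Deployed on Feb. 29 and the target year is not a leap year.
        return deployed.replace(year=deployed.year + REFRESH_YEARS, day=28)


def due_for_rehosting(apps: dict[str, date], today: date) -> list[str]:
    """List applications whose refresh point has arrived, sorted by name."""
    return sorted(name for name, deployed in apps.items()
                  if refresh_date(deployed) <= today)


# Hypothetical inventory of application deployment dates.
inventory = {"billing": date(2000, 6, 1), "crm": date(2003, 1, 15)}
print(due_for_rehosting(inventory, today=date(2005, 7, 1)))
```

Applications on the resulting list are the candidates for moving data into the pools that the storage standards prescribe.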
5. Drive economies of scale.
ILM policies will become more sophisticated as virtualization becomes embedded and automation reduces the demands on storage personnel to intervene and resolve problems. The key metric will be a rising number of terabytes that a storage administrator and a database administrator can effectively manage. At this point, growth in Fibre Channel/SCSI should be modest, since the midline pool will have absorbed much of what was on expensive disks.
6. Implement intelligent ILM-based storage.
As cross-application ILM-based policy management software matures circa 2008-10, expect storage-related hours on production systems to drop by 80%, even though growth will leave more storage to manage than there is today. Although attempts will be made to accelerate the process, full legal and audit compliance sign-off on automated ILM policy management won't occur before this time.
Peter S. Kastner is co-founder and a member of the board of directors of Aberdeen Group Inc., an industry market research firm in Boston.