When we add employees at Harvard's hospitals, we provide them with services like heat, power, light and TCP/IP, always available and in generous supply. We monitor usage and expand the supply accordingly, as would a utility firm.
Over the past year, it has become clear to me that storage must be added to this list. Employees expect their files to be available around the clock in the office and at home. No matter where they are in the world, they expect to be able to access their e-mail, including that 8MB PowerPoint file they sent in 2003. The level of reliability, accessibility and security required by today's computer-savvy knowledge workers necessitates a centralized storage utility.
However, providing a storage utility service on a limited budget can be challenging because, as economist and Harvard President Larry Summers has said, "the demand for a free service is infinite." Although quotas may be an effective way to ensure that employees review and maintain their files, they are time-consuming to enforce.
Our answer has been hierarchical storage management (HSM). Personal files start out on a high-availability, high-speed storage-area network. After a short period, unused files are automatically moved to Serial ATA network-attached storage (NAS) or content-addressed storage (CAS). From there, unused files are moved to tape and archived at a very low cost per gigabyte. We also use business-continuance volumes, snap copies and database shadowing to speed up backup and recovery. Users can automatically retrieve their files from NAS, CAS or tape by clicking on the file name and waiting a few seconds for the restore.
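The tiering rule described above can be sketched in a few lines of Python. This is only an illustration, not the configuration of any real HSM product: the 30-day and 365-day thresholds, the tier names and the `choose_tier` function are all assumptions, since the actual idle periods are not stated here.

```python
from datetime import datetime, timedelta

# Hypothetical idle-time thresholds; the article gives no specific figures.
NAS_AFTER = timedelta(days=30)
TAPE_AFTER = timedelta(days=365)


def choose_tier(last_access: datetime, now: datetime) -> str:
    """Pick a storage tier from how long a file has gone unused."""
    idle = now - last_access
    if idle >= TAPE_AFTER:
        return "tape"          # cheapest per gigabyte, slowest to restore
    if idle >= NAS_AFTER:
        return "nas_or_cas"    # Serial ATA NAS or content-addressed storage
    return "san"               # high-availability, high-speed tier


if __name__ == "__main__":
    now = datetime(2005, 6, 1)
    for last in (datetime(2005, 5, 25), datetime(2005, 3, 1), datetime(2003, 1, 1)):
        print(last.date(), "->", choose_tier(last, now))
```

A real system would also leave a stub or pointer in place of each migrated file so that clicking the file name triggers the restore; that part is omitted from this sketch.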
We also archive unread e-mail and old attachments to CAS. This gives employees an essentially unlimited e-mail box.
We're required to maintain all health care records for 30 years, but we aren't required to permanently store e-mail, instant messages or personal files. HSM enables us to implement policy-based archiving and destruction. We can determine not only what gets moved, but also how long it is kept. We can set a maximum number of years for storage, send out a warning and delete items when the threshold is met. We can also use HSM to identify unusually large volumes of MP3, WAV and MPEG files.
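A retention policy of this kind (a maximum age, a warning, then deletion) might look like the following minimal sketch. The seven-year limit, the 30-day warning window, the record-type labels and the `retention_action` function are hypothetical values chosen for illustration. Health care records are simply never deleted here, because the only rule stated for them is the 30-year minimum.

```python
from datetime import date, timedelta

# Hypothetical policy values; no actual limits are given in the article.
MAX_AGE = timedelta(days=365 * 7)     # maximum retention for non-clinical items
WARN_WINDOW = timedelta(days=30)      # how far ahead of deletion to warn


def retention_action(created: date, today: date, record_type: str) -> str:
    """Return 'keep', 'warn' or 'delete' for a stored item."""
    if record_type == "health_record":
        # Must be maintained for 30 years; this sketch never deletes them.
        return "keep"
    age = today - created
    if age >= MAX_AGE:
        return "delete"
    if age >= MAX_AGE - WARN_WINDOW:
        return "warn"
    return "keep"


if __name__ == "__main__":
    print(retention_action(date(1996, 1, 1), date(2005, 6, 1), "email"))
    print(retention_action(date(1996, 1, 1), date(2005, 6, 1), "health_record"))
```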
This centralized approach to storage enables us to offer a high-value service to our employees; reduce spending on local storage by using kiosk-type PCs with very small hard disks; enforce business rules on file security, retention and availability; and enhance the reliability of our infrastructure.
But it does have its costs. When failures occur (and they will, albeit very rarely), the impact is substantial. Instead of a single user losing data, hundreds or thousands of people may not be able to reach their files. In my view, risk equals likelihood times impact. With a single desktop hard drive, the likelihood of failure is high, but impact is low. With central storage utilities, likelihood is very low, but impact is very high.
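The risk formula above is easy to put into numbers. The figures below are invented purely to show the shape of the trade-off (high likelihood with low impact versus low likelihood with high impact); they are not measured failure rates, and the comparison depends entirely on the values chosen.

```python
def risk(likelihood: float, impact: float) -> float:
    """Risk as the product of likelihood and impact."""
    return likelihood * impact


# Hypothetical figures: yearly failure probability and number of users affected.
desktop_drive = risk(likelihood=0.05, impact=1)        # fails often, hurts one user
central_storage = risk(likelihood=0.001, impact=5000)  # fails rarely, hurts thousands

if __name__ == "__main__":
    print("desktop drive:", desktop_drive)
    print("central storage:", central_storage)
```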
Also, the cost of acquiring and maintaining storage, even hierarchically managed storage, doesn't yet follow Moore's Law (or its storage corollary). Demand for storage is growing faster than the cost of storage is falling, causing the budget for central storage to rise, slowly and steadily.
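The budget effect can be illustrated with a simple projection. In this sketch the yearly budget is demand times unit cost, so it rises whenever (1 + demand growth) × (1 − cost decline) is greater than 1. The function name `project_budget` and all of the rates and starting values are assumptions for illustration, not actual figures.

```python
def project_budget(demand_gb: float, cost_per_gb: float,
                   demand_growth: float, cost_decline: float,
                   years: int) -> list[float]:
    """Project the yearly storage budget as demand (GB) times cost per GB."""
    budgets = []
    for year in range(years + 1):
        demand = demand_gb * (1 + demand_growth) ** year
        cost = cost_per_gb * (1 - cost_decline) ** year
        budgets.append(demand * cost)
    return budgets


if __name__ == "__main__":
    # Hypothetical: demand grows 50% a year while unit cost falls 30% a year.
    for year, budget in enumerate(project_budget(1000, 10.0, 0.5, 0.3, 5)):
        print(f"year {year}: {budget:,.0f}")
```

With these made-up rates the combined factor is 1.5 × 0.7 = 1.05, so the budget creeps up about 5% a year even though each gigabyte gets cheaper.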
However, our experience thus far is that the pros outweigh the cons and that centralized storage is here to stay. For us, storage has truly become the fifth utility.
John D. Halamka is CIO at CareGroup Health System, CIO and associate dean for educational technology at Harvard Medical School, chairman of the New England Health Electronic Data Interchange Network, CIO of the Harvard Clinical Research Institute and a practicing emergency physician. Contact him at firstname.lastname@example.org.