What once was old is new again with policy-based storage management software. It could change the way companies think about how they store, archive, back up and recover corporate data.
This new breed of storage software takes the concepts introduced with traditional hierarchical storage management (HSM) tools and extends them. Not only will this software increase the utilization of storage, reduce back-up and recovery times and archive data, but it also will manage the hierarchy of data from its creation.
With this automated storage management software, IT managers will no longer need to devote hours - if not days - to moving aging data from expensive, enterprise storage arrays to less-expensive, near-line tape or offline, archival media. They won't have to write scripts to identify what data needs moving, or groom disks constantly to eliminate files that are no longer required by law or company policy. And gone will be the days spent culling data from disks because drives have reached their capacity or redistributing data to other disks to achieve maximum use.
That's not all. Automated storage management software extends the reach of traditional HSM tools by looking at data in use and, following set policies, making decisions on how applications or business processes are tied to the data.
Such tools will become increasingly desirable as users struggle to cope with unwieldy - and costly - data stores, says Jamie Gruener, senior analyst with the Yankee Group. The Yankee Group estimates the market for automating the provisioning of storage for apps at US$500 million by 2005.
"The fact that storage is such a growing part of the IT budget will force customers to find ways to track storage automatically [starting] at the application and business level," Gruener says. "Once a customer knows the storage requirements each specific application needs, rolling out applications can become an automated task, saving precious personnel time spent on doing this function manually."
Automation by application works in provisioning, resource management, backup and recovery, and the archiving of data.
New York law firm Weil, Gotshal & Manges sees promise in such tools for archiving e-mail attachments, for example.
"Being a law firm, a lot of stuff that used to come by FedEx (Corp.) is now coming by e-mail," says Steve Kedem, senior network engineer with the firm who manages 8 terabytes of data on Hewlett-Packard, Hitachi and MTI storage. "The volume of data we are getting is growing exponentially . . . because of attachments," he says, noting that data is replicated to other locations as many as four times for fault tolerance.
And Kedem sees no end in sight. He anticipates the amount of stored data will only increase because law firms may be required to archive mail for seven years. So the firm is evaluating products that would let it separate attachments and store them centrally, thus saving disk space, Kedem says. He is looking at policy-based e-mail archiving products from FalconStor Software, IBM, Legato Systems and Sun that migrate messages to tape after storing them on disk for a specified amount of time.
"We are looking at software now that will not only consolidate [mail] but automatically archive it. The key is to make it all transparent to the user," Kedem says.
Traditional HSM, while automating data backup, recovery and archiving, doesn't touch data that is still being managed as part of an application, even if that data hasn't been accessed in a certain amount of time. HSM can't predict that a disk drive failure will interrupt an application. It can't retain data from a specific application based on rules that say it must be archived on nonrewritable media because of a federal regulation.
Brent Hawkins, network administrator for Earl Walls Associates, an architectural and engineering firm in San Diego, knows the limitations of traditional HSM. He uses Computer Associates' HSM for NetWare and CaminoSoft's Highway Server software to store files associated with completed laboratory design projects. But he'd like greater flexibility in the type of files migrated.
"We have a volume with thousands of project directories. Under these projects is a set of directories, with one being an 'archive' directory. With the CA software, we set up a policy to migrate files within this directory only," he says.
Now Hawkins is looking into new policy-based storage management tools that would set policies for moving data based on project status.
HSM hasn't evolved in the open systems arena, either. In this environment, when more storage is needed, IT managers tend to buy more inexpensive disk arrays. They don't cull data out or migrate it to less-expensive devices.
Even Kedem is tempted by the allure of inexpensive disks.
"Disk is becoming so cheap, it is almost worth it to keep it on disk," he says.
But buying more disks compounds the storage management problem. As disk capacity grows, management headaches increase. Managing all the storage and making good decisions on where it should reside is getting too complicated and taking too much time to do manually. Data that hasn't been accessed in a year is being kept on enterprise arrays, taking up space that data generated by business-critical applications needs.
"Most shops put data on a box, and it stays there effectively forever, whether or not the value of that data merits being on that resource any longer," says Steve Duplessie, senior analyst with Enterprise Storage Group.
IT managers stop only long enough to back up that data to tape, snapshot it to another disk or replicate it to an expensive disk in another location.
"Rarely do you see any true proactive management, where shops scour the data they have on devices and move it to a smarter spot," Duplessie says.
Now with automated storage resource management and provisioning, IT managers can apply rules and thresholds to capacity and utilization monitoring and provision storage according to application need.
Products from start-ups such as AppIQ, Invio Software and CreekPath Systems examine how frequently data is accessed, the importance of the data to the business, and the time it takes to retrieve it. With these packages, a network manager could specify that a mission-critical Oracle database be recovered before any other application in the event of a failure.
Alternatively, network managers could set policies granting more bandwidth for backups to mission-critical applications. CreekPath is one of the few start-ups shipping such a product. Its AIM software defines three types of policies: explicit, rules-based and constraint. Explicit policies are user-definable rules affecting specific parameters such as RAID levels. In an explicit policy, a volume might be mirrored remotely to another device to ensure availability. Rules-based policies invoke a specific user or system action when an event occurs. For instance, AIM will notify administrators when a drive is underutilized so they can shift storage to it. Constraint-based policies put limits on a device: if the device is used, it also must be mirrored to a remote array for fault tolerance.
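The difference between the three policy types can be made concrete with a small Python sketch. This is purely illustrative and is not CreekPath's AIM interface: the `Volume` class, the function names and the 20 percent low-utilization mark are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical storage volume; field names are assumptions."""
    name: str
    utilization: float          # fraction of capacity in use, 0.0 to 1.0
    in_use: bool = True
    remote_mirror: bool = False


def explicit_policy(volume):
    """Explicit: set a specific parameter directly, here remote mirroring."""
    volume.remote_mirror = True
    return volume


def rules_based_policy(volume, low_water=0.20):
    """Rules-based: an event (low utilization) triggers an action (a notice)."""
    if volume.utilization < low_water:
        return f"notify: {volume.name} is underutilized"
    return None


def constraint_policy(volume):
    """Constraint: a device that is in use must also be mirrored remotely."""
    return (not volume.in_use) or volume.remote_mirror
```

In this sketch the explicit policy changes a setting, the rules-based policy reacts to an event, and the constraint policy only reports whether a limit is being honored.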
But storage management companies such as BMC Software, CA and IBM Tivoli are getting into the game, too. BMC offers automated storage management with its Application-Centric Storage Management initiative, and CA and EMC promise such capability by year-end. Meanwhile, Tivoli has promised to announce a policy-based storage resource management package later this month.
Fast recovery
Automation is being applied more intelligently than ever to the back-up and recovery process, in an effort to address rising concerns about business continuity born of the Sept. 11 tragedy. In a recent survey of 96 network professionals attending Network World's summer seminar tour "Storage Town Meeting: Ensuring Business Continuity," 41 percent said they are currently upgrading or plan to upgrade their business continuity plans, while 24 percent said they revised their plans within the last six months.
Start-ups such as Avamar Technologies are developing policy-based software that saves data to inexpensive disks rather than tape, thus speeding the back-up and recovery process and shrinking the amount of time a customer must reserve for backup. Other vendors that have jumped into this arena include Atempo, EMC, Legato, Network Appliance, OTG Software, QLogic and Quantum, with products that replace tape with inexpensive Advanced Technology Attachment drives from which data can be recovered more quickly. Companies claim data is restored from disk as much as 100 times faster than from tape.
Connected, a PC management and data-recovery company in Framingham, Mass., which backs up about 1 million computers for several large clients, has turned to EMC's Centera hardware and software for automated storage operations. The company is replacing tape drives with Centera's inexpensive disk and using policy-based software to migrate data from primary Symmetrix storage to Centera after it reaches a certain age, says Tom Hickman, engineering operations manager.
"The data is stored first on Fibre Channel-attached storage," he says. "The server has a threshold control [which is set in software] - say you allocated 100G bytes to store data on. When capacity reaches 85 percent, the migration utility sends the data automatically to a tape library or to the Centera."
Setting policies and then letting Centera and its software act on them saves Hickman from having to perform the tasks manually.
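The threshold behavior Hickman describes can be sketched in a few lines of Python. This is an illustration only, not EMC's or Connected's software. The `StoredFile` class and `files_to_migrate` function are hypothetical names, and moving the oldest files first until usage drops back under the threshold is an assumption; the article says only that migration starts when capacity reaches 85 percent.

```python
from dataclasses import dataclass


@dataclass
class StoredFile:
    """Hypothetical record for a file on primary storage."""
    name: str
    size_gb: float
    age_days: int


def files_to_migrate(files, allocated_gb=100.0, threshold=0.85):
    """Pick files to move off primary storage once usage reaches the threshold.

    Assumption: the oldest files go first, and selection stops as soon as
    usage falls below the threshold again.
    """
    used = sum(f.size_gb for f in files)
    limit = allocated_gb * threshold
    selected = []
    if used < limit:
        return selected  # under the threshold, nothing to migrate
    for f in sorted(files, key=lambda f: f.age_days, reverse=True):
        if used < limit:
            break
        selected.append(f)
        used -= f.size_gb
    return selected
```

With 90G bytes stored against a 100G-byte allocation, the sketch would select the oldest file and stop once usage is back under 85G bytes.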
Automation is even coming to the process of back-up reporting. BackupReport software from start-up Bocada gathers completion records from CA, HP, Legato, Microsoft and Veritas back-up software running on a network and compiles them in a single set of reports. BackupReport costs $500 per server, and could save IT managers considerable time.
Some large companies run as many as 40 back-up packages, vendors say.
No doubt, policy-based automated storage promises to ease the management of rapidly growing and out-of-control media. When automation hits storage, users will have the luxury of setting policies and letting the system do the work.
Considerations for data migration and retention
Analysts recommend that corporations decide when to migrate data to less-expensive disk or tape storage based on a number of factors:
-- Application type. You probably need to keep mission-critical databases online for years.
-- Age of the data. If no one has accessed data for six months, consider migrating it to less-expensive media.
-- Data type. Consider moving data such as digital X-rays that doesn't change but is still needed - what EMC calls static content - to less-expensive disk media.
-- Laws regulating data retention. Consider if your company needs to archive e-mails or accounting data because of SEC rules or other regulatory requirements.
-- Network considerations. Examine the cost per megabyte of storage, retrieval speed and capacity limits.
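Several of the factors above could be combined into a simple placement rule, sketched here in Python. The `recommend_tier` function, its parameter names, the 180-day cutoff standing in for "six months" and the order in which the factors are checked are all assumptions made for illustration; network cost and speed are left out to keep the sketch short.

```python
def recommend_tier(mission_critical, days_since_access,
                   static_content, retention_required):
    """Suggest where data should live, based on the factors listed above.

    Assumed precedence: application type, then data type, then age,
    with retention rules deciding between archive and cheaper media.
    """
    if mission_critical:
        return "online"
    if static_content:
        return "less-expensive disk"
    if days_since_access >= 180:
        if retention_required:
            return "archive"
        return "less-expensive disk or tape"
    return "online"
```

A real policy engine would weigh these factors against one another rather than checking them in a fixed order, but the sketch shows how each consideration turns into a testable rule.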