Computerworld

A new kind of backup

We’ve come to depend on applications that can undo everything they do. In Microsoft Word, for example, you can reverse a series of edits by leaning on Control-Z, and you can restore them with Control-Y. But when you save the file, your undo stack evaporates and the last edit wins. If you accidentally delete the file, you might be out of luck because the Recycle Bin only works when applications (primarily Windows Explorer) use the Windows shell API. If you accidentally overwrite the file, you’re almost certainly out of luck.

Estimates of the cost of recovering from these kinds of incidents range from high to astronomical — a revelation to no one, as we’ve all committed such blunders more than once. Though we might not call these incidents disasters, collectively that’s what they add up to. When the Storage Survey asked 475 IT leaders to name their key storage management challenges, 75 per cent of respondents cited the need for enhanced backup and disaster recovery capabilities. So it’s no surprise that the volume shadow copy technology in Windows Server 2003 is attracting lots of attention.

Taking a shadow copy snapshot of a server volume is a quick operation because the snapshot does not copy data; it just fixes a point in time. Changes made after that point, and only those changes, accumulate in a hidden volume. Clients, after installing a shell extension, see an extra Previous Versions tab in the properties dialogue box of a volume or folder mapped to the server. Each previous version lets the user view, or restore, the former state of the volume or folder. The shadow volume, which defaults to 10 per cent of the volume it lives on, could become a popular kind of disk-based nearline storage. That’s a strategy that 65 per cent of survey respondents say they’ll implement in the next 12 months to simplify backup and accelerate recovery.
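The point-in-time idea can be pictured with a toy model. The Python sketch below is only an illustration under simple assumptions: the `Volume` class and its method names are invented for this example and are not the Windows implementation. It shows why a snapshot is cheap to take, and how only the blocks changed afterwards end up in a separate diff area.

```python
class Volume:
    """Toy block device with point-in-time snapshots (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # live data
        self.snapshots = []          # one diff area (dict) per snapshot

    def snapshot(self):
        # Taking a snapshot copies nothing: it only opens an empty diff area.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, data):
        # Before overwriting, preserve the old block in every snapshot
        # that has not already saved its own copy of that block.
        for diff in self.snapshots:
            diff.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def read_snapshot(self, snap_id, index):
        # Blocks unchanged since the snapshot are read from the live volume.
        return self.snapshots[snap_id].get(index, self.blocks[index])


vol = Volume(["a", "b", "c"])
snap = vol.snapshot()      # instant: no data is copied
vol.write(1, "B")          # only now is the old block "b" set aside
```

After the write, the live volume holds the new block while `vol.read_snapshot(snap, 1)` still returns the old one, and the diff area contains a single entry for the one block that changed.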

A fact not widely known is that Windows XP, both home and professional editions, contains an early version of the volume shadow copy service. “We like to stabilise file systems early in this group,” says David Golds, group program manager of the core file systems team at Microsoft.

By the time Server 2003 hits the streets, the volume shadow code will have been given a good burn-in. XP’s version of the shadow copy service is limited in several ways, though, according to Golds. The shadow holds only one level of undo, and it doesn’t persist across reboots. Currently it is used only in conjunction with XP’s backup program, which can create a snapshot so it needn’t skip open files. As soon as users learn to depend on the “previous versions” shell extension for server volumes, they’ll expect it to work locally too, which should prove feasible. XP users may well wonder, though, whether shadow copy restore could have been bringing back files deleted or overwritten on their own machines all along.

The elusive backup window for online business data is, of course, a key storage management headache. A third of the Storage Survey respondents cited the need to minimise that window. Snapshots neatly solve the problem, but the shadow technology also has far-reaching implications. Server 2003’s comprehensive backup API is just part of a new middleware architecture that enables applications and storage systems to work together intelligently. Just as ODBC and its successors delivered common ways to work with structured data, the three-way coordination among shadow copy writers, requestors, and providers promises to unify methods for handling low-level storage.

Writers and requestors engage in an intricate dance of cooperation. A requestor, such as a backup program, signals its intent to take a snapshot. A writer, such as a database, responds by temporarily suspending writes to disk, then resuming when the snapshot is complete. At that point, the requestor can read data that’s consistent and isn’t locked.
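The steps of that dance can be sketched in a few lines of Python. This is a simplified model, not the Windows API: the `DatabaseWriter` class and its `freeze` and `thaw` methods are hypothetical names chosen for the illustration, and the requestor's part is played by the plain statements at the bottom.

```python
class DatabaseWriter:
    """Hypothetical writer that holds back disk writes while frozen."""

    def __init__(self):
        self.disk = []       # records that have reached disk
        self.pending = []    # records held back during a freeze
        self.frozen = False

    def write(self, record):
        # While frozen, writes are queued instead of touching the disk.
        if self.frozen:
            self.pending.append(record)
        else:
            self.disk.append(record)

    def freeze(self):
        self.frozen = True

    def thaw(self):
        # Resume normal operation and flush whatever was queued.
        self.frozen = False
        self.disk.extend(self.pending)
        self.pending.clear()


db = DatabaseWriter()
db.write("order-1")

db.freeze()               # requestor signals its intent to snapshot
db.write("order-2")       # arrives mid-snapshot, so it is held back
image = list(db.disk)     # requestor reads a consistent, unlocked view
db.thaw()                 # snapshot complete: writer resumes
```

The snapshot image contains only the record written before the freeze, while the database itself ends up with both records once it thaws.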

Writers also feed useful information to requestors. Windows Server 2003 includes a set of shadow copy writers: one for basic files and a series of others for specialised data stores, such as the IIS metabase, the system and COM+ registries, the event log, and more. Each specialised writer describes, in XML, the kinds of storage it manages. A database, for example, might distinguish between its data and log files.
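To make the idea concrete, here is a rough Python sketch of how a requestor might read such a description. The XML layout, element names, and file paths are all invented for the illustration and do not reflect the actual schema that Windows writers use; only the standard-library parser is real.

```python
import xml.etree.ElementTree as ET

# Hypothetical writer metadata: the element and attribute names are
# assumptions made for this example, not the real Windows schema.
METADATA = """
<writer name="ExampleDatabase">
  <component kind="data" path="D:/db/orders.dat"/>
  <component kind="log" path="E:/logs/orders.log"/>
</writer>
""".strip()

root = ET.fromstring(METADATA)

# Group the files the writer manages by the kind of storage they hold.
files_by_kind = {}
for component in root.findall("component"):
    kind = component.get("kind")
    files_by_kind.setdefault(kind, []).append(component.get("path"))
```

A backup tool reading this description could then treat the data and log files differently without knowing anything else about the database that produced them.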

Introducing a generalised scheme for this metadata is an idea whose time has come. It enables an application to supply guidance to any API-compliant requestor. As Golds points out, staging of live data, for querying or for software development and testing, will be a popular use of shadow copy. Data that intelligently describes itself will simplify the often painful task of migration and reassembly. For storage vendors, a common API will be a welcome respite from the combinatorial explosion of applications and data stores they must integrate. Golds says the API’s “very strong state machine and very clear error recovery” should not only radically simplify integration, but also make reliable results easier to achieve.

The shadow copy provider API works below the file system at the block level. That means when enterprise storage systems vendors deliver high-performance hardware-based implementations, the systems will work in the same way as Windows Server 2003’s built-in software-based provider.

“It’s the same tested code path through the file system,” Golds says. Because shadow copy and volume defragmentation both involve remapping of blocks, he adds, Server 2003 gains efficiency by making the two services aware of one another and able to work together.

Server 2003 includes other notable storage-related advances, too. The Distributed File System, in particular, is smarter, more scalable, and more flexible than earlier incarnations. But the shadow copy technology stands out as an innovation that will change how everyone, from the individual end user to the mightiest SAN vendor, thinks about storage.