Continuous-capture data protection: Questions to ask before you talk to the vendors

Continuous-capture, or journaling, data protection is a new backup/recovery paradigm. It offers a level of data protection and availability that was unimagined a few years ago except by the handful of dreamers who were making prototypes of it.

If you haven't encountered this concept yet, here it is in a nutshell: Changes to a primary disk volume are detected as they happen, and those changes are immediately replicated to a second device. Replication is typically asynchronous, so the data protection process doesn't slow down payload processes such as your e-mail or CRM systems. The trick is that the changes aren't merely applied to the second device as a mirror copy. Instead, they are journaled, stored sequentially, so that every version that any stored object passed through can be recovered. Since the other device is a disk array, recovery can be very fast. Many types of recovery are nearly instantaneous.
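For readers who think in code, the journaling idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any vendor's implementation: the `Change` and `Journal` classes, the `record` and `volume_as_of` methods, and the integer timestamps and block numbers are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Change:
    timestamp: int  # hypothetical clock tick when the write hit the primary volume
    block: int      # which block was written
    data: bytes     # the new contents of that block


@dataclass
class Journal:
    """Append-only record of every write (hypothetical, for illustration only)."""
    entries: list = field(default_factory=list)

    def record(self, timestamp, block, data):
        # Changes are appended in arrival order; older versions are never overwritten.
        self.entries.append(Change(timestamp, block, data))

    def volume_as_of(self, timestamp):
        # Replay the journal up to and including `timestamp` to rebuild the volume.
        # Assumes entries were recorded in non-decreasing timestamp order.
        volume = {}
        for change in self.entries:
            if change.timestamp > timestamp:
                break
            volume[change.block] = change.data
        return volume


journal = Journal()
journal.record(1, block=0, data=b"draft")
journal.record(2, block=1, data=b"notes")
journal.record(3, block=0, data=b"final")

print(journal.volume_as_of(2))  # block 0 still holds b"draft"
print(journal.volume_as_of(3))  # block 0 now holds b"final"
```

A mirror would keep only the latest contents of block 0. Because the journal keeps both writes, either version can be rebuilt on request.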

The best feature of the continuous-capture paradigm may be its reliability rather than its fine granularity of backup resolution or its speed of recovery. Unlike notoriously finicky networked backup systems, which tend to be vulnerable to every sort of human error and every variation in the schedules and behavior of the servers they are meant to protect, journaling systems are hard to break. Data capture happens all the time, and because new changes don't overwrite the old ones (at least not for a long time), even if something does happen to the capture process, every change right up to the point of failure is already safe. Once the capture system is installed, the process runs without human intervention until a recovery is needed, so there's little opportunity for human error. That hands-off operation can also keep operational costs low.

Why wasn't backup always like this? The base technologies that continuous-capture systems employ have been understood for a long time, but in order to make deployment practical, a number of supporting technologies had to mature first. Huge low-cost disk arrays are the most obvious and arguably the main enabler. These big, cheap disks are certainly the trigger that led to six vendors announcing products in 2003. But other trends, such as affordable storage networking, disk virtualization and reliable open driver architectures, all contributed too. Regardless of how it evolved, continuous data protection is with us now.

As when any new technology makes its debut, potential purchasers may be left scratching their heads. Last year's service-level agreements for backup and recovery were probably based on the presumption that only so much was possible -- and "so much" generally meant that the maximum backup frequency was daily and the minimum recovery time was measured in hours. Replication-based technologies that promised some form of rapid recovery were reserved for the most critical of applications, largely because of their high cost.

All those assumptions are out the window now. In short, you need to recalibrate your yardsticks, because data protection service levels you thought cost $1 million per terabyte can now be matched, or nearly so, for much less. Here are five questions to consider before buying a data protection technology:

1. What's my definition of a "critical" system?

Critical systems are the ones that merit the highest level of data protection. Many companies define them as systems that cause revenue-generating activity to stop when they're down. But in practice, the number of systems that are actually treated as critical is often severely limited by the cost of providing high data availability. These new journaling systems will provide more protection and nearly as little downtime as classic high-end mirroring with six or more replicas, at a much lower cost. The mere risk of productivity loss may be enough to justify investing in continuous capture.

2. Where do my data-loss risks come from?

Journaling protects data in ways that neither backup nor mirroring can. It reduces vulnerability to viruses and worms, because as soon as an invasion is detected, you can roll right back to just before the point of corruption. The journal will even help you pinpoint where and how the invasion occurred. The same applies to other sources of data corruption, such as human error or a volatile software environment. If hardware failure or site disaster is your biggest risk, however, you won't want to give up on "true" mirroring as the first line of defense.
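Rolling back to just before the point of corruption can be sketched as follows. The journal layout (tuples of timestamp, block and data), the `looks_corrupt` check and both helper functions are hypothetical, chosen only to make the idea concrete. A real product would detect corruption by its own means.

```python
def first_bad_entry(journal, looks_corrupt):
    """Return the earliest journal entry flagged by the check, or None."""
    for entry in journal:
        if looks_corrupt(entry):
            return entry
    return None


def roll_back_before(journal, bad_entry):
    """Rebuild volume contents from every entry recorded before bad_entry."""
    volume = {}
    for timestamp, block, data in journal:
        if timestamp >= bad_entry[0]:
            break  # stop just before the point of corruption
        volume[block] = data
    return volume


# Hypothetical journal: (timestamp, block number, data written), oldest first.
journal = [
    (100, 0, b"invoice-1"),
    (101, 1, b"invoice-2"),
    (102, 0, b"XXXX-worm"),  # the corrupting write
    (103, 2, b"invoice-3"),
]


def looks_corrupt(entry):
    # Stand-in for whatever detection is actually available (assumption).
    return b"worm" in entry[2]


bad = first_bad_entry(journal, looks_corrupt)
clean = roll_back_before(journal, bad)

print(bad)    # the entry that pinpoints when and where corruption began
print(clean)  # volume contents as they stood just before that write
```

In this sketch the write at timestamp 103 is discarded as well, because the whole volume is returned to a single earlier point in time.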

3. How would my systems administrators spend their time?

In most data centers, the single biggest administrative cost is the care and feeding of backup. Continuous capture's claims of reducing these costs aren't field-proven yet, but the qualitative arguments are strong. Two aspects to consider are how continuous capture will reduce operator intervention in the backup process and how it will simplify the backup network.

4. Who's in charge of recovery?

Another source of high administrative costs in some businesses is the amount of time operators spend recovering individual files for end users. Elsewhere, such recovery requests are routinely refused unless the impact of the loss is very high. Some continuous-capture models make access to prior versions of data so simple and secure that anyone who can use a desktop computer can recover their own data. Not all enterprises want to be this open, but don't overlook the option.

5. Where's that old wish list?

A data journal has many applications beyond data protection. Look for new forensic data applications to emerge that use continuous histories to diagnose system problems, hacker techniques and more. Nearer term, journals that can present as virtual point-in-time replicas can enable secondary processes like tape archiving, data extraction, data integrity checking, and reporting to run in parallel with primary transaction processing, without affecting the performance or availability of the primary process.

One more question: When was the last time you read a thousand-word article on backup that didn't mention the "backup window"? That's something that would be easy to get used to.

- Marcia Reid Martin is a senior advisory software engineer and chief EchoView architect at StorageTek in Louisville, Colo.
