Continuous data protection has been one of the buzzwords du jour ever since a few companies began using the term publicly at Storage Networking World last October. A few weeks ago, we discussed CDP in terms of its applicability to e-mail. Today, we will look at what CDP is and next time, we will consider where you might find CDP useful, and which vendor you may want to consider as a provider.
CDP has two general applications: business continuity and disaster recovery. When you look at how it is being marketed, you are likely to find some vendors positioning it as a backup solution while others treat it as an archiving technology. In fact, in most cases it is neither, and in some instances it is a bit of both. The important thing is that in most IT rooms CDP is likely to prove to be an excellent complement to both backups and archiving.
The way CDP works differs in one important respect from the way both backups and archiving work. With backups and archiving, changed data (or all the data in the case of a full backup) is written to another medium at some specified time - full backups every Saturday morning at 2 a.m., and incrementals every other morning at midnight, for example. With CDP, every change in the data is recorded irrespective of when the change takes place.
It is this point - lack of reliance on timed writes to the CDP data store - that primarily differentiates it from even the most granular application of snapshot technology. With snapshots the granularity may also be very fine, but the software still just captures changes at specified times. Note that this does not make snapshotting less efficient than CDP - as is the case with most things in life, we just have to make sure we apply the proper tool to the job at hand.
The point here is that with CDP, changed data is written with a very fine level of granularity: each point in time where a change occurs is captured and becomes a recovery point.
With CDP the changed blocks are saved to disk, typically some SATA nearline storage device. Think of it as a recovery server. Because of the granularity of the saves, production data can be rewound to any point in time. Because the data resides on disks, recovery times can be quite fast when compared to recovery from tape.
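To make the "rewind" idea concrete, here is a minimal sketch of a block-level change journal. It is an illustration only, not any vendor's implementation: the class name `CdpJournal`, its methods, and the timestamps are all hypothetical, and a real product would persist the journal to disk rather than hold it in memory.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class CdpJournal:
    """Toy block-level change journal (hypothetical, for illustration only)."""

    # Each entry is (timestamp, block_number, data), appended in time order.
    entries: list = field(default_factory=list)

    def record_write(self, timestamp: float, block: int, data: bytes) -> None:
        # Every write is journaled as it happens; there is no schedule.
        if self.entries and timestamp < self.entries[-1][0]:
            raise ValueError("writes must be recorded in time order")
        self.entries.append((timestamp, block, data))

    def rewind_to(self, point_in_time: float) -> dict:
        """Rebuild the block image as it stood at point_in_time."""
        timestamps = [t for t, _, _ in self.entries]
        cutoff = bisect_right(timestamps, point_in_time)
        image = {}
        for _, block, data in self.entries[:cutoff]:
            image[block] = data  # later writes overwrite earlier ones
        return image


journal = CdpJournal()
journal.record_write(10.0, 0, b"alpha")
journal.record_write(12.5, 1, b"bravo")
journal.record_write(13.0, 0, b"corrupt")  # e.g. a bad write we want to undo

before_corruption = journal.rewind_to(12.9)
latest = journal.rewind_to(13.0)
```

Because every write carries its own timestamp, any moment between two writes is a valid recovery point; here, rewinding to just before the third write recovers block 0 as it was before the bad data arrived.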
Recovery point objectives (RPO) can therefore be expected to be as precise as is necessary with CDP (in case you have forgotten, an RPO is that point in time to which data must be restored in order for transaction processing to resume). Because the data is written to disk, CDP should also help achieve recovery time objectives (RTO) that are shorter than those possible when recovering data from tape.
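A small worked comparison may help show why the recovery point differs between scheduled backups and CDP. The sketch below is hypothetical: times are simply hours counted from an arbitrary starting point, the backup schedule and change times are made up, and the helper names are ours.

```python
from bisect import bisect_right


def latest_recovery_point(recovery_points, failure_time):
    """Return the newest recovery point at or before failure_time, or None."""
    idx = bisect_right(recovery_points, failure_time)
    return recovery_points[idx - 1] if idx else None


def changes_lost(change_times, recovery_point, failure_time):
    """List the changes made after the recovery point but before the failure."""
    start = float("-inf") if recovery_point is None else recovery_point
    return [t for t in change_times if start < t <= failure_time]


# Hypothetical timeline, in hours from an arbitrary start.
backup_times = [0, 24, 48]     # scheduled backups once every 24 hours
change_times = [1, 30, 47.5]   # moments when production data changed
failure_time = 47.75           # the failure strikes shortly before the next backup

# Scheduled backup: recovery points exist only at the scheduled times.
backup_point = latest_recovery_point(backup_times, failure_time)
lost_with_backup = changes_lost(change_times, backup_point, failure_time)

# CDP: every change is itself a recovery point.
cdp_point = latest_recovery_point(change_times, failure_time)
lost_with_cdp = changes_lost(change_times, cdp_point, failure_time)
```

In this made-up timeline the scheduled backup can only return to hour 24, so the changes at hours 30 and 47.5 are gone, while the CDP journal can return to the change at hour 47.5 and loses nothing recorded before the failure.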
Several vendors are building in added services based on their CDP recovery server. Every vendor offers specific intelligence about at least one application. A number of companies build in both local and remote replication capabilities. Remote replication might offer some value as part of your DR planning, but local replication will certainly be valuable to sites where volumes are replicated so that development, test or analysis can be done using real-world data sets. Expect most vendors that don't presently offer replication to have it in their product pipeline.