The next step is to identify the business value of each type of data object. That means understanding three things: what kind of data you are dealing with, who will be using it and what its keywords are. This stage is a prerequisite for classification. The strategy you deliver to your various organizations must emphasize mitigating risk in three areas: data security, data availability and data integrity. In current IT terminology, the rules you set to accomplish this are your policies.
First, create classification rules. This means assigning value based on criteria such as business importance, availability and performance requirements, and legal, regulatory and corporate governance rules. The result is guidance that indicates where each kind of data is to be stored. The list in the chart below shows one way to go about this. The classification scheme will be your own, reflecting local needs, but whatever terminology you use, it will be useful to identify at least three classes of data.
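As a rough illustration of what such rules can look like once written down, the Python sketch below assigns one of three classes from the criteria named above. The class names ("critical", "important", "archival"), the 1-to-5 importance scale, the 24-hour recovery threshold and the field names are all hypothetical; your own scheme will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataObject:
    """A data object described by the criteria used for classification."""
    name: str
    business_importance: int   # hypothetical scale: 1 (low) to 5 (high)
    max_recovery_hours: float  # availability requirement (hypothetical field)
    regulated: bool            # subject to legal/regulatory/governance rules


def classify(obj: DataObject) -> str:
    """Return one of three hypothetical class names for a data object."""
    # Regulated or highly important data goes in the top class.
    if obj.regulated or obj.business_importance >= 4:
        return "critical"
    # Moderately important data, or data needed back within a day.
    if obj.business_importance >= 2 or obj.max_recovery_hours <= 24:
        return "important"
    # Everything else is a candidate for the lowest class.
    return "archival"
```

Keeping the rules in one small function makes them easy to review with stakeholders before any data is actually moved.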
Second, build retention policies. Establish a link between each class of data and the hardware and services it requires, including rules governing when each data class should be moved to the next storage tier. At the very least, this means determining the correct storage tiers, security levels, degree of data protection and migration strategies. For example, files answerable to Sarbanes-Oxley will typically require disk-based backup for rapid retrieval; for others, backup to tape is fine. Don't forget the lowest level of the hierarchy, the offline archives. You will find that in many cases a data class may really refer to only a single data set, and that archiving suited to one class of data may not be at all suitable for another.
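The link between data classes and storage can be captured in a simple lookup table. The sketch below is one possible shape for it, not a recommendation: the tier names, backup targets and day counts are invented for illustration, and the class names are the same hypothetical ones a three-class scheme might use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RetentionPolicy:
    """Storage and migration settings for one data class (all hypothetical)."""
    tier: str                         # where the class normally lives
    backup: str                       # "disk" for rapid retrieval, else "tape"
    demote_after_days: Optional[int]  # None means never demoted
    next_tier: Optional[str]          # tier to demote to, if any


# Illustrative values only; real thresholds come from your stakeholders.
POLICIES = {
    "critical": RetentionPolicy("tier1_disk", "disk", 90, "tier2_disk"),
    "important": RetentionPolicy("tier2_disk", "tape", 180, "offline_archive"),
    "archival": RetentionPolicy("offline_archive", "tape", None, None),
}


def target_tier(data_class: str, days_since_last_access: int) -> str:
    """Return the tier a data set should occupy given its class and idle time."""
    policy = POLICIES[data_class]
    if (
        policy.next_tier is not None
        and policy.demote_after_days is not None
        and days_since_last_access >= policy.demote_after_days
    ):
        return policy.next_tier
    return policy.tier
```

Note that the lowest class has no further demotion step, which mirrors the offline archives sitting at the bottom of the hierarchy.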
Aligning data classes with the data life cycle predefines the events that will drive ILM policies, and plays a key role in enabling both automated resource provisioning and automated data migration. With few exceptions, this process will be the same at all sites: classification simply means aligning your stakeholders' business requirements with the IT infrastructure. Create a formal procedure that identifies each group's requirements, how it values its data and how satisfactory current performance levels are.
Talk is cheap, but working with vendors rarely stays that way. Most IT folks have been doing this for as long as they can remember, and the rules haven't changed just because they pertain to ILM products. In some cases, one vendor can supply all your needs. In most cases, however, that will not be true. Concentrate on vendors whose solutions can incorporate your legacy systems, and whose offerings include data classification capabilities. Don't be afraid to talk to consultants, read what the analysts have to say and compare notes with colleagues at other sites.
When you engage with the vendors, make sure to understand their products' capabilities in each of the following areas:
-- Ability to tag files as compliant for each required regulation.
-- Data classification.
-- Data deduplication.
-- Disaster recovery and business continuity.
-- Discovery of compliance-answerable files across Windows, Linux, Unix and any other operating systems you may have.
-- Fully automated file migration based on locally set migration policies.
-- Integration with backup, recovery and archiving solutions already on-site.
-- Searching (both tag-based and other metadata-based).
-- Security (access control, identity management and encryption).
-- Security (antivirus).
-- Policy setting to move files to appropriate storage devices (content-addressed storage, WORM tape).
-- Finding and tagging outdated, unused and unwanted files for demotion to a lower storage tier.
-- Tracking access to and lineage of objects through their life cycle.
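To get a feel for what one of these capabilities involves before talking to vendors, consider the item on finding outdated and unused files for demotion. The sketch below walks a directory tree and reports files not modified within a given number of days. It is a minimal illustration under stated assumptions: the function name and the use of modification time as the staleness signal are choices made here, and a commercial product would typically also weigh access patterns, ownership and compliance tags.

```python
import os
import time
from pathlib import Path
from typing import Iterator, Optional


def find_stale_files(
    root: str, max_age_days: float, now: Optional[float] = None
) -> Iterator[Path]:
    """Yield files under root last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = Path(dirpath) / filename
            try:
                if path.stat().st_mtime < cutoff:
                    yield path
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
```

A call such as find_stale_files("/some/share", 365) would then list candidates for demotion; the path here is only a placeholder.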