Computerworld
Too much ETL equals poor data management
Ken Karacsony  28 September, 2005 15:33

To put it bluntly, performing extensive extract, transform and load (ETL) processes is a symptom of poorly managed data and a fundamental lack of a cogently developed data strategy. When data is managed correctly as an enterprise asset, then ETL is significantly reduced and in many cases completely eradicated. Now, I realize that this is a provocative statement, but in my estimation, ETL is overused within the IT community, leading to inefficiency and unnecessary expense.

ETL gained popularity as companies began to outgrow antiquated systems. As functionality was moved from legacy systems to open systems architectures, ETL played an indispensable role in moving the data. Unfortunately, many companies failed to completely retire their outdated systems; rather than performing ETL as a one-time initial load event, ETL evolved into a part of daily operations.

This problem was further exacerbated as companies developed systems within functional silos. The application-specific approach, in which the database is designed to accommodate the needs of an individual group or department, took root. According to this methodology, every new system requires its own database. As a result, data is copied from system to system. Hence, ETL is now firmly ensconced in nearly every company and is an integral part of IT operations.

Consider a simplified example of typical ETL activities, in which data is propagated from the product system into warranty, finance, purchasing and sales systems, and eventually into the data warehouse. Not only is the data extracted and loaded, but it must also be transformed because the data structures between systems are completely disparate.

This problem is compounded when the data is propagated back to the source system in order to synchronize the data that's no longer synchronized between systems precisely because it was copied. The inevitable result is poor data quality and high maintenance costs.

If the product database in this example changes -- for instance, if a new field or table is added -- it will be necessary to change all of the maps that move data from the source to a target. One minor structural change in the source can create a maintenance nightmare in the ETL maps and target databases -- a lot of IT expense with no value added.

The only legitimate ETL activity in this example is the data warehouse interface. All others are unnecessary and incur a tremendous cost. According to Larry English, president of Information Impact International and a leading expert in information quality, "The IS staff is busy maintaining, on average, a tenfold [increase in] redundant databases and the redundant applications or interface programs that re-create or move the data." Ouch.

The solution to the problem is really quite simple, but it requires a new way of thinking. The answer is a single, sharable, enterprise-strength database for each major subject area. The key is to design each of these databases to meet the needs of all of the information consumers.

With the advent of new technology, there is absolutely no need to propagate data from system to system, thereby creating unnecessary (and expensive) redundancy. Applications can now access data via shared services from different databases.

Information is a corporate asset and doesn't belong to one group or department. Therefore, databases must be designed and implemented with both producer and consumer in mind. The entire company must be involved in defining entities and attributes and modelling relationships.

To support this new way of thinking, applications can't be designed in a vacuum. IT must re-engineer applications to use common, shared databases centred on the significant subject areas. The database, and not the application, is the centre of the universe.

There's a simple rule to determine whether a company has an effective data strategy: count the number of ETL processes and redundant databases. The higher the number, the less likely it is that the company manages its data correctly.

Next, ask, "Why create another database with similar data?" Often the answer is that the data is not in the right format or doesn't support the new application, or the source is incomplete and inaccurate. And then I know that the company is wasting money and resources and performing unnecessary, costly maintenance and development rather than producing value-added services based on sound data management techniques.

Computerworld Buyer's Guide - Vendors Matched to this Article
More about VIA, ADVENT, TenFold

Comments

Post new comment

Login or register to link comments to your user profile, or you may also post a comment without being logged in.
The content of this field is kept private and will not be shown publicly.
Enter the fully qualified URL, eg. http://www.example.com/
  • Web page addresses and e-mail addresses turn into links automatically.
  • Allowed HTML tags: <a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd>
  • Lines and paragraphs break automatically.

More information about formatting options

Add to Google
Computerworld Buyer's Guide - Vendors Matched to this Article
Zones
Zone logoZones provide focussed content from Computerworld and leading technology partners.
Newsletter Subscription
Newsletter Subscription
Sign up for our Computerworld newsletters!
Syndicate content
 

Computerworld Webinar

Thursday, June 11th, 2009
10:30am EST (Sydney, Australia)
Screening at your PC

Computerworld is hosting a 30 minute live webinar to help you to learn how unified communications can save you money, foster innovation and business agility by making it easier for people to find, reach and collaborate with one another.

Register Now

Computerworld Community Comments
Whitepaper

The business justification for data security

In the information security world we face two major types of threats: "noisy" threats which directly interfere with our ability to do business and "quiet" threats which cause real damage, but don't necessarily prevent people from doing their jobs. Read on to discover how to combat both types of threats and to justify the use of data security within your business.

Enterprise IT Buyer's Guide
Find Technology Vendors Fast
 
Find vendors by name | Find by category
Sponsored Links
 
Send Us E-mail | Privacy Policy
Features List | Media Kit | Advertising | Contact Us

Copyright 2009 IDG Communications. ABN 14 001 592 650. All rights reserved.
Reproduction in whole or in part in any form or medium without express written permission of IDG Communications is prohibited.