The emergence of data integration software is giving companies the ability to move back-office, enterprise resource planning information to the Internet.
Data integration products provide software "caching", or data staging, between a company's Internet computers and back-office systems from companies such as SAP, Oracle, Sybase and PeopleSoft.
Data integration provides a mirror image of back-office information that is stored on a company's main computers. When an Internet customer needs to check on the status of an order, the enquiry is directed to the data integration software. The company's main computers do not always need to be accessed. Data integration software has enough intelligence to know when to synchronise with the main computers to keep data up to date.
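The order-status scenario above can be illustrated with a small read-through cache. This is only a sketch under stated assumptions: the class name `StagedOrderCache`, the `fetch_from_back_office` callable and the fixed `max_age_seconds` refresh rule are hypothetical, and a real product would decide when to synchronise in more sophisticated ways.

```python
import time


class StagedOrderCache:
    """Hypothetical staging cache that sits in front of a back-office lookup."""

    def __init__(self, fetch_from_back_office, max_age_seconds=60.0,
                 clock=time.monotonic):
        self._fetch = fetch_from_back_office  # assumed callable: order_id -> status
        self._max_age = max_age_seconds       # assumed staleness threshold
        self._clock = clock
        self._entries = {}                    # order_id -> (status, time_fetched)
        self.back_office_calls = 0

    def order_status(self, order_id):
        """Answer from the cache if the entry is fresh, else resynchronise."""
        now = self._clock()
        entry = self._entries.get(order_id)
        if entry is not None and now - entry[1] < self._max_age:
            return entry[0]
        # Entry is missing or stale: go to the main system and restage it.
        status = self._fetch(order_id)
        self.back_office_calls += 1
        self._entries[order_id] = (status, now)
        return status
```

Repeated enquiries for the same order within the threshold are served from the staged copy, so the main computers are contacted only when the entry is missing or has aged out.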
Integrating enterprise resource planning (ERP) data for e-commerce applications is done by combining data staging with direct access to ERP data, using a data server and data caches. Data integration software intelligently blends direct real-time and batch data-access methods for extracting data from an ERP system.
Data progresses from one or more sources to one or more target tables, or message types (such as XML). The steps of the data movement involve identifying the sources from which data should be extracted, transformations the data should undergo, and where to send the data. Users specify data mappings and transformations through a graphical user interface.
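The three parts of a data movement (source, transformations, target) can be sketched as a declarative mapping. The function `run_movement`, the `ORDER_MAPPING` table and the source column names are hypothetical and chosen only for illustration; in the products described, users would define the equivalent through the graphical interface rather than in code.

```python
def run_movement(source_rows, mapping):
    """Apply a source-to-target mapping to each extracted row.

    mapping: target column -> (source column, transformation function)
    """
    target_rows = []
    for row in source_rows:
        target_rows.append(
            {target: transform(row[source])
             for target, (source, transform) in mapping.items()}
        )
    return target_rows


# Hypothetical mapping from back-office column names to a target table.
ORDER_MAPPING = {
    "order_id": ("ORDER_NO", str.strip),   # trim padding from the key
    "amount": ("NET_VALUE", float),        # text to number
    "currency": ("CURR", str.upper),       # normalise currency codes
}
```

For example, `run_movement([{"ORDER_NO": " 1001 ", "NET_VALUE": "19.50", "CURR": "usd"}], ORDER_MAPPING)` yields one target row with a trimmed order number, a numeric amount and an upper-case currency code.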
User-defined processes control movement of each block of data and define interdependencies between such movements. For example, if one target table depends on values from other target tables, processes are used to specify the order in which a data server should run individual data movements that fill the tables.
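One way to picture this ordering is as a topological sort over table dependencies. The sketch below uses Python's standard `graphlib` module; the table names and the `movement_order` helper are hypothetical, and nothing here is meant to describe how any particular data server schedules its work.

```python
from graphlib import TopologicalSorter


def movement_order(depends_on):
    """Return an order in which target tables can be filled.

    depends_on: target table -> set of target tables it reads values from.
    Tables with no unmet dependencies come first.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical dependencies between three target tables.
DEPENDS_ON = {
    "order_summary": {"orders", "customers"},
    "orders": {"customers"},
    "customers": set(),
}
```

With these dependencies, `customers` must be filled before `orders`, and both before `order_summary`. A circular dependency would raise `graphlib.CycleError`, which is a useful signal that the process definition itself needs fixing.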
Movements can be designed to run in batch or real-time mode, and are created and managed by administrators to control data movement between ERP, e-commerce, customer relationship management, supply-chain management, and legacy and messaging applications.
Data movement uses distributed query optimisation, multithreading, in-memory caching, in-memory data transformations, and parallel pipelining to deliver high data throughput and scalability. To manage the extraction process and perform batch data extraction from SAP software, for example, the software uses optimised code written in ABAP (SAP's proprietary programming language), obviating the need to develop and maintain customised ABAP code.
Key to the data integration architecture is a data cache that includes a target schema, source-to-target mappings, and transformations that handle change-data capture, hierarchy extraction, error recovery and security. In addition, a data cache contains predefined data extraction jobs that automatically populate the cache with data from a company's back-office systems and/or data warehouse.
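Change-data capture, one of the transformations mentioned above, can be approximated by comparing two snapshots of a source table. This is a deliberately simple sketch: the `capture_changes` function is hypothetical, and snapshot comparison is only one of several techniques (others rely on timestamps or database logs), not necessarily the one any given product uses.

```python
def capture_changes(previous, current):
    """Compare two snapshots keyed by primary key.

    previous, current: dict of key -> row (any comparable value).
    Returns (inserts, updates, deletes) so that only changed rows
    need to be applied to the cache.
    """
    inserts = {k: v for k, v in current.items() if k not in previous}
    updates = {k: v for k, v in current.items()
               if k in previous and previous[k] != v}
    deletes = [k for k in previous if k not in current]
    return inserts, updates, deletes
```

Applying only the inserts, updates and deletes keeps the cache current without reloading every row on each extraction run.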
A cache serves as a single point of integration for enterprise and e-commerce data, minimising the need for direct access to back-office systems and for complex real-time integration. The cache offloads numerous, unnecessary requests for data from back-office systems, thus letting e-commerce companies scale to a greater number of users, while letting back-office systems do what they are designed to do.
* Carol Mills Baldwin is CEO of Acta Inc in California