Analysis: Databases get a grip on XML
The next iteration of the SQL standard was supposed to arrive in 2003. But SQL standardisation has always been a glacially slow process, so nobody should be surprised that SQL:2003 — now known as SQL:200n — isn’t ready yet. Even so, 2003 was a year in which XML-oriented data management, one of the areas addressed by the forthcoming standard, showed up on more and more developers’ radar screens.
The steady growth of Web services, with an increasing focus on SOAP payloads — XML documents defined by the rules of XML Schema — was one contributing factor. The advent of Microsoft Office 2003 was another. Now that Microsoft’s productivity applications speak XML natively, there will be growing demand for XML-aware databases that can catalogue, search, and transform the XML representations of spreadsheets, memos, and forms.
Oracle’s XML DB, part of Oracle9i Release 2 (and now also 10g), anticipated the need. Available since mid-2002, the technology supports XML as a native type in the database and can map between W3C XML Schema and the SQL:1999 object model. XML content can be handled in more or less structured ways according to the needs of the application. At one end of that continuum, a database trigger can prevent insertion of an XML fragment that violates a referential integrity constraint. At the other end, a collection of XML documents can look like a hierarchical file system with which users navigate and directly interact.
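To make the structured end of that continuum concrete, here is a minimal sketch of the idea in Python. It is not Oracle's XML DB API. It assumes SQLite as a stand-in database and enforces the check in application code rather than in a database trigger. The table names, the `customer` attribute, and the helpers `make_db`, `insert_order`, and `order_items` are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def make_db():
    """Create an in-memory database: a parent table plus a table of XML documents."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, doc TEXT)"
    )
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Acme')")
    return conn


def insert_order(conn, xml_text):
    """Store an <order> fragment only if the customer it references exists.

    This plays the role the article assigns to a database trigger; here the
    check runs in application code (an assumption made for illustration).
    """
    root = ET.fromstring(xml_text)  # raises ParseError on malformed XML
    customer_id = int(root.attrib["customer"])
    known = conn.execute(
        "SELECT 1 FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    if known is None:
        raise ValueError(f"order references unknown customer {customer_id}")
    conn.execute(
        "INSERT INTO orders (customer_id, doc) VALUES (?, ?)",
        (customer_id, xml_text),
    )


def order_items(conn, customer_id):
    """Return the text of every <item> in the stored orders for one customer."""
    rows = conn.execute(
        "SELECT doc FROM orders WHERE customer_id = ? ORDER BY id", (customer_id,)
    ).fetchall()
    return [
        item.text
        for (doc,) in rows
        for item in ET.fromstring(doc).findall("item")
    ]
```

A fragment such as `<order customer="1"><item>widget</item></order>` is accepted, while one that names a customer absent from the `customers` table is rejected before it reaches the `orders` table. A database with native XML support would do this inside the engine rather than in client code.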
IBM’s counterpart to Oracle’s XML DB, the DB2 XML Extender, received a few minor updates in 2003. But the big story was the DB2 Information Integrator, which takes a very different approach to that of Oracle. Rather than enabling many flavours of data to coexist happily in big central databases, IBM says it wants to enable many data flavours to federate smoothly across a diversity of databases. The DB2 Information Integrator mainly targets the SQL-oriented developer, but it supports XML (and free-text) search, maps between relational and XML schemas, invokes XML Web services from SQL expressions, and can produce and transform XML result sets.
Microsoft’s long-awaited “Yukon” edition of SQL Server didn’t ship in 2003, but the first beta reached some developers, who could at least kick the tyres. Some of the capabilities now expected to arrive in 2004, such as a native XML data type and XQuery support, lag behind the mainstream competition. A key differentiator for Yukon will be its use of the .Net Framework as a foundation for database programming.
While the spotlight shone on the heavyweight contenders, a couple of agile innovators made noteworthy advances in 2003. OpenLink Software’s Virtuoso 3.0, which we reviewed in March, stole thunder from all three major players. Like Oracle, it offers a WebDAV-accessible XML repository. Like DB2 Information Integrator, it functions as database middleware that can perform federated “joins” across SQL and XML sources. And like the forthcoming Yukon, it embeds the .Net CLR (Common Language Runtime), or in the case of Linux, Novell/Ximian’s Mono.
Also in 2003, Sleepycat Software released Berkeley DB XML, an XML-enhanced version of its popular embedded database. Available under dual open source/commercial licensing, Sleepycat’s core product has long been favoured for a variety of high-performance, transactional, but nonrelational applications. With XML support, it became interesting to a variety of players in the XML Web services arena. As increasing volumes of XML messages flow through service pipelines, there’s a need for specialised, high-performance data management tools. Berkeley DB XML exemplifies this emerging category.
A trend to watch in 2004 is growing support for XQuery. Current methods of querying XML data are rooted in a tradition of document-oriented XML; XQuery, in contrast, focuses on data-oriented XML. Given that SQL took decades to evolve, XML-oriented data management will probably take a while to gather steam.
But the wheels are turning.
Story by Jon Udell