Oracle's Jeremy Burton talks XML

At its Oracle OpenWorld user conference in San Francisco this week, Oracle Corp. jumped into the Web services fray with both feet. In a bid to gain market share in the J2EE (Java 2, Enterprise Edition) application server world, Oracle has added support for Web services standards in Oracle9iAS, along with better clustering, wireless capabilities, and bolstered security. To make XML-based Web services integral to its flagship relational database, Oracle is detailing a project called XDB (XML database support), which is designed to provide high-performance XML document storage. Oracle's vice president of worldwide marketing, Jeremy Burton, spoke to InfoWorld editors Michael Vizard, Steve Gillmor, Martin LaMonica, Tom Sullivan, and Mark Leon about Oracle's views on emerging software technologies and architectures.

InfoWorld: There's a lot of talk about native XML storage in relational databases. What are the different levels of "XML support"?

Burton: [The first level is] if we look at the tags in the XML document and we can build an index based on those tags, and although the document is stored in kind of one big chunk, we can make the document searchable from standard SQL. The next level, I guess, would be that as the document goes in, you do a certain amount of parsing of the XML document and you apply structure to the data as it goes into the database. Now, the overhead is that you've got to do parsing, so there's a performance hit there. But the upside is that you can add a bit more structure to that document; you don't store it in one piece. So if you just want to search abstracts, for example, or conclusions of documents that are tagged with XML, then you can do that.
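The first level Burton describes, a document kept as one chunk but made findable through an index on its tags, can be sketched roughly as follows. This is an illustration only, not Oracle's implementation; the `OpaqueStore` class and the sample documents are hypothetical, and the sketch parses each document once at insert time simply to collect its tag names.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


class OpaqueStore:
    """Keeps each document as one unparsed string plus a tag -> doc ids index."""

    def __init__(self):
        self.docs = {}
        self.tag_index = defaultdict(set)

    def put(self, doc_id, xml_text):
        # The document itself is stored whole, as one chunk.
        self.docs[doc_id] = xml_text
        # Only the tag names are pulled out and indexed.
        for elem in ET.fromstring(xml_text).iter():
            self.tag_index[elem.tag].add(doc_id)

    def docs_with_tag(self, tag):
        # .get() avoids creating empty index entries for unknown tags.
        return sorted(self.tag_index.get(tag, set()))


store = OpaqueStore()
store.put("a", "<paper><abstract>Storage levels</abstract></paper>")
store.put("b", "<memo><body>Lunch on Friday</body></memo>")
with_abstract = store.docs_with_tag("abstract")
with_footnote = store.docs_with_tag("footnote")
```

The index can say which documents contain an abstract, but finding text inside that abstract still means opening the whole stored chunk, which is the limitation the next level addresses.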

So maybe you've got XML documents which have an introduction, body, and conclusion. You could store the introduction, body, and conclusion separately and search those three things independently. Or, if you want to take more of a hit when the document goes in, you can write some code to parse the entire document so you can do more granular searching. The big problem with doing a lot of parsing and adding structure is that you take a performance hit. Why? Because when you've got a relational database, a table is a different shape to a series of nested structures, which is what an XML document is.
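The introduction, body, and conclusion example could look something like the sketch below, which parses a document once on the way in and stores each section as its own row. It uses Python's built-in `sqlite3` and `xml.etree.ElementTree` purely for illustration; the `sections` table, the `shred` and `search` helpers, and the sample report are assumptions, not anything Oracle has described.

```python
import sqlite3
import xml.etree.ElementTree as ET

DOC = """<report id="r1">
  <introduction>Why XML storage matters for databases.</introduction>
  <body>Parsing cost grows with the structure you extract.</body>
  <conclusion>Store sections separately to search them independently.</conclusion>
</report>"""


def shred(conn, xml_text):
    """Parse once at insert time; store each top-level section as a row."""
    root = ET.fromstring(xml_text)
    doc_id = root.get("id")
    rows = [(doc_id, child.tag, (child.text or "").strip()) for child in root]
    conn.executemany(
        "INSERT INTO sections (doc_id, section, content) VALUES (?, ?, ?)", rows
    )
    return len(rows)


def search(conn, section, term):
    """Search one named section without touching the others."""
    cur = conn.execute(
        "SELECT doc_id FROM sections WHERE section = ? AND content LIKE ?",
        (section, f"%{term}%"),
    )
    return [row[0] for row in cur.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sections (doc_id TEXT, section TEXT, content TEXT)")
count = shred(conn, DOC)
hits_conclusion = search(conn, "conclusion", "independently")
hits_intro = search(conn, "introduction", "independently")
```

The parsing in `shred` is the insert-time cost Burton mentions; the payoff is that a plain SQL query can target conclusions alone.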

So most (vendors), right now, they can take the document and store it in a database and make it opaque. I think ourselves and probably IBM can index the document as it goes in and make it searchable. And I guess there's much debate over how much structure you can add to a document and how searchable you can make it. I'd say we do some things that IBM can't do, but it's not a long-term defensible advantage.

So we set about kind of solving the problem for good, and one of the things that we have (is) called XDB, a project at Oracle. One of the underlying technologies that our new system is based on is objects. If you look at an XML document, it's a series of nested structures. If you look at objects in the database, it's a series of nested structures. And the XML document by and large is exactly the same shape as the object in the database. If you're then [storing] that object, you can do it in a very high-performance way -- there's no huge amount of parsing and flattening of structures and then reconstituting them later. The beauty of it as well from the way we've implemented our objects is that your application need not know nor care whether they're dealing with SQL, relational information, or structured XML-related information. And we also store the document in a very highly compressed way. We add structure to the document, we store it in the objects. And then the tags we pull out and store in a separate metadata directory. All the time we're building a directory of every tag available that your company deals with, and it often means we can store the information in a very highly compressed form.
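A toy version of the two ideas in this answer, storing the document as a nested structure of the same shape and pulling tag names out into a shared directory, might look like the following. It is a sketch under stated assumptions rather than a description of XDB: the `TagDictionary` class, the tuple layout, and the sample order document are all hypothetical, and attributes and mixed content are ignored for brevity.

```python
import xml.etree.ElementTree as ET


class TagDictionary:
    """Shared directory mapping each distinct tag name to a small integer id."""

    def __init__(self):
        self.ids = {}
        self.names = []

    def intern(self, tag):
        if tag not in self.ids:
            self.ids[tag] = len(self.names)
            self.names.append(tag)
        return self.ids[tag]


def encode(elem, tags):
    """Turn an element into a nested (tag_id, text, children) tuple."""
    return (
        tags.intern(elem.tag),
        (elem.text or "").strip(),
        tuple(encode(child, tags) for child in elem),
    )


def decode(node, tags):
    """Rebuild the element tree from the nested tuple and the tag directory."""
    tag_id, text, children = node
    elem = ET.Element(tags.names[tag_id])
    if text:
        elem.text = text
    for child in children:
        elem.append(decode(child, tags))
    return elem


SOURCE = (
    "<order><customer>Acme</customer>"
    "<item><sku>A1</sku></item><item><sku>B2</sku></item></order>"
)
tags = TagDictionary()
stored = encode(ET.fromstring(SOURCE), tags)
restored = ET.tostring(decode(stored, tags), encoding="unicode")
```

The stored form keeps the nesting of the original document, so nothing has to be flattened into rows and rebuilt later, and repeated tag names such as `item` and `sku` are written once in the directory and referenced by id everywhere else.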

InfoWorld: So the tags are essentially indexed and you're taking the effort to objectify, for lack of a better phrase, the XML components of a document.

Burton: (We'll) make it very easy to define objects that map onto the document types that you want to accept, No. 1. And No. 2, we'll store that XML document in a very highly compressed way. And I guess most importantly, your existing SQL tools will work against the XML data and your existing XML tools will work against it and they'll think it's XML data. And the bottom line is that whether you're talking SQL or XML, you neither know nor care.

InfoWorld: So I don't have to go suddenly master these new XML query languages that people are talking about either?

Burton: No. We'll have those as well if people want to use them, but you don't have to if you don't want to.

InfoWorld: Is there a lot of overhead in terms of adding the object capabilities to the XML document?

Burton: No, it's only the object database functionality. When we were competing with Illustra back in the mid-1990s, we built out this object technology. I don't know if you remember back in the mid-1990s, but there was huge hysteria around object databases.

And so we went on and we built this thing and then once we built it, what people actually found was that they didn't really want objects in the database, they wanted objects in the application. And, you know, hence C++ and now Java and EJB (Enterprise JavaBeans).

Call it by design or sheer quirk of fate, but XML happens to be the killer document type for an object database because it's a series of nested structures, and you really can make a database object exactly the same shape as an XML document so you can persist it very, very quickly. And that's practically zero performance overhead in persisting an XML document inside a database object.

InfoWorld: Are you saying that no one really had a use for the object stuff that you built back in the mid-1990s, but now you're discovering a use for it with XML?

Burton: Let me describe the uses of object databases. People who would model chemical formulae would use it, people who would model network topologies and telecommunications companies would use it. But those are pretty niche users. And it wasn't the kind of big mainstream feature that everyone was hyping it to be. And in a lot of ways it was a hammer looking for a nail, and it just so happened that XML happens to be probably the biggest nail on the planet right now. And we've kind of got this hammer that we built three or four years ago, which is basically embedded inside our relational engine. And it's been maturing for the last three or four years. And it almost does a perfect job managing these things.

InfoWorld: Today you're parsing XML pages, in the future you won't have to do that, right?

Burton: Yeah. This whole notion of parsing and how much effort you expend in parsing by and large disappears. You only parse something when you're looking for pieces of information that you want to put in a different place. If you've got a perfect match between store (and the) kind of document that you want to store, there's very little parsing to do.

InfoWorld: It sounds like you're taking the SQL access and making that dependent on the XML objects. Is that correct?

Burton: No. There're a number of things here. You've got an XML document, No. 1, which is a transient format for information. Information is only really stored in XML format while it's going between businesses. So when the XML document gets to point B, companies are not just kind of going to leave it in a file system. I mean there're a couple of things -- they want to store it in a reliable, manageable way. Or an application is going to take that XML document and do something with it. And so we take the XML document. ... The big question then is, how efficiently and how rapidly can you store that? So we have something called XDB, which will persist XML documents the same way as we take relational information and persist it in tables. And there's no performance overhead for doing so. So the issue of XML documents and database objects and SQL, they really are three different things. If you want to talk SQL, that's great. Your query tool won't even know that you're actually querying XML. If you've got some kind of application that wants to speak XML, great. They won't even know that they're talking to a relational database.

InfoWorld: To shift gears a little, when I keep reading all this stuff about what the app server vendors are doing with Web services, they make it sound like app servers are the center of the Web services universe. And I'm curious about what Oracle's view is on Web services as it relates to the database itself and what the relationship between the database, app servers, and Web services is exactly.

Burton: Well, Web services -- I guess it's interesting and it's a necessary standardization of how you call applications. But Web services, in and of itself, is one small part of a much bigger picture. I mean people still will spend 95 percent of their time writing business logic, and then they'll spend five percent of their time exposing certain parts of that business logic as Web services. And I think, by and large, people will use Web services for loosely coupled applications, such as the ones whereby the applications are located in different businesses. I think they have a huge role to play in terms of how businesses communicate, but the notion of, "Hey, I'm going to build an application; why don't I collect together 3,000 Web services and wire them up and it'll all work great," is I think a flawed notion.

To be honest, to actually go in and implement the Web services standards -- SOAP (Simple Object Access Protocol), UDDI (Universal Description, Discovery, and Integration), and WSDL (Web Services Description Language) -- it's pretty easy. We really -- and this relates to the application server -- we support all of that stuff. The JDeveloper toolset allows you to productively build the application, and it really is not that hard. So it's actually much harder to go implement the Java or J2EE than it is to go implement the standards to support Web services.
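To give a sense of how thin the Web services layer can be relative to the business logic behind it, here is a minimal sketch that wraps one existing function in a SOAP 1.1 request and response. It is not Oracle9iAS or JDeveloper code; the `urn:example:orders` namespace, the `GetOrderStatus` operation, and the `order_status` function are hypothetical, and a real service would also need transport, fault handling, and a WSDL description.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:orders"  # hypothetical application namespace


def order_status(order_id):
    """Stand-in for existing business logic (hypothetical)."""
    return "SHIPPED" if order_id == "1001" else "UNKNOWN"


def handle_request(xml_text):
    """Unwrap a SOAP request, call the business logic, wrap the reply."""
    root = ET.fromstring(xml_text)
    body = root.find(f"{{{SOAP_NS}}}Body")
    call = body.find(f"{{{APP_NS}}}GetOrderStatus")
    order_id = call.findtext(f"{{{APP_NS}}}orderId")
    status = order_status(order_id)

    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    out_body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    reply = ET.SubElement(out_body, f"{{{APP_NS}}}GetOrderStatusResponse")
    ET.SubElement(reply, f"{{{APP_NS}}}status").text = status
    return ET.tostring(envelope, encoding="unicode")


REQUEST = f"""<soap:Envelope xmlns:soap="{SOAP_NS}" xmlns:o="{APP_NS}">
  <soap:Body>
    <o:GetOrderStatus><o:orderId>1001</o:orderId></o:GetOrderStatus>
  </soap:Body>
</soap:Envelope>"""

response = handle_request(REQUEST)
status_text = ET.fromstring(response).findtext(f".//{{{APP_NS}}}status")
```

Nearly all of the sketch is envelope handling around a single call to `order_status`, which is where the real work would live.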

InfoWorld: OK. So to you it's just a layer of integration software that you can layer in on top of the app server and connect a database to and we're off to the races?

Burton: Yeah. The developers will still spend 90 percent to 95 percent of their time writing Java code, building business logic. Once the business logic is kind of done, I think they'll then look at useful parts of the application to expose those Web services so the companies can interrogate the application or transact to the application. But it really is less than five percent of the job. And as I said, I think it's a very useful necessary standardization (of) the process of how applications communicate. But Web services are not kind of a panacea for application development. As I said, to think that people will build enterprise applications and assemble them from 3,000 loosely coupled parts is in the realms of fantasy.

Unlike a lot of the other Web services vendors, we actually build a business application. In fact, one of the things -- we've not actually talked about this yet, but will in our upcoming (Apps World) Conference in the new year -- we'll actually talk about how ... the e-business suite will start to expose key functionality as Web services. You're really talking about coarse grain components and Web services accessing those coarse grain components.

InfoWorld: Does that mean that large-scale ERP applications might be more modular in how they're packaged because you can expose subsets as Web services more easily?

Burton: Well, I think ERP applications are packaged in a pretty modular way right now, I just think the way that you interface with them will be done in a much more standard way.

What that'll help is ... the barrier to moving information between businesses will be much lower, which means that you can make processes between businesses way more efficient and hopefully save them money.

I mean Microsoft seems to be pitching this idea that Web services are a way for them to make money. I guess our approach, as an applications vendor, is that Web services will make business processes more efficient, and that will help customers save money. So they're a necessary component of business applications for process efficiency. But I don't know whether people are going to, in the business world, pay for the privilege of that. It's just expected functionality inside business applications. I mean we do a lot of this today with EDI (electronic data interchange), right? It's just that the barrier of entry to EDI is quite high, it's too expensive. A lot of the small companies are not going to invest in EDI infrastructure to move transactions. I think what Web services do, which is great, is really lower that barrier and allow the small guys to benefit from automation, as well as the big guys. And the business process is reduced for everyone.

InfoWorld: Do you think that this is also going to lower the barrier to entry for best-of-breed solutions?

Burton: Well I mean, again, it fundamentally doesn't solve the problem. Again, don't fall into the trap that Web services are a panacea for integration. Certainly they help you move a transaction from point A to point B. So, for example, if I've bought Epiphany's marketing system, I can move a lead which that marketing campaign generates into Oracle Sales Force Automation software. But what they don't give you is they still won't tell you how many sales your marketing campaign generated. You've still then got to go ahead and build a data warehouse to get any decent information out. That's a purely automated transactional flow, No. 1. And No. 2, they don't really deal with the shape of the data. So if you're moving customer information from a marketing system to a sales system, I mean Epiphany defined the customer different to the way Oracle did. If they have 10 fields that describe the customer and we have 20, then we've got 10 missing when that customer moves. So the data is still the wrong shape. And it does allow you to move a transaction from point A to point B, but it falls far short of solving the problem, which is that you still get no good information out.
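The "wrong shape" problem in the 10-field versus 20-field example can be made concrete with a small sketch. The field names and the one-to-one mapping below are hypothetical (the interview gives only the counts), and the `transfer` helper is an illustration rather than any vendor's integration code.

```python
# Hypothetical field names; the interview gives only the counts (10 and 20).
SOURCE_FIELDS = [f"mkt_field_{i}" for i in range(1, 11)]
TARGET_FIELDS = [f"sales_field_{i}" for i in range(1, 21)]

# Assumption: the ten source fields map one-to-one onto the first ten target fields.
MAPPING = {f"mkt_field_{i}": f"sales_field_{i}" for i in range(1, 11)}


def transfer(record, target_fields, mapping):
    """Move a record into the target shape and report fields left unfilled."""
    moved = {mapping[name]: value for name, value in record.items() if name in mapping}
    target = {field: moved.get(field) for field in target_fields}
    missing = [field for field in target_fields if field not in moved]
    return target, missing


lead = {name: f"value-{name}" for name in SOURCE_FIELDS}
customer, missing = transfer(lead, TARGET_FIELDS, MAPPING)
```

The transaction arrives intact, yet half of the target record is empty, and nothing in the transport layer can supply the values the source system never captured.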

InfoWorld: So Web services for 2002 is going to be about the applications?

Burton: Yeah, I mean consider this for a second. Right now the platform is the database and application server, right? That's what people go to. But who's to say that in two, three years' time, there won't be an enterprise application platform that automates all the business processes inside your company and allows people who are in verticals, for example, to build on top of it. ... And I think the way that other applications and companies will interface with it is probably through Web services.
