Mike Stonebraker is something of a database legend. He was the main architect of the Ingres relational database, the Postgres object-relational database, and the Mariposa federated data system. He was founder and CTO of the Ingres, Illustra, and Cohera corporations as well as CTO at Informix and Required Technology. His latest project is StreamBase Systems, where he is founder and CTO. Stonebraker recently met with InfoWorld Editor at Large Paul Krill to discuss StreamBase and his previous projects.
What are StreamBase's goals?
We set about building a new piece of system software from the ground up that's very good at inhaling fire hoses of incoming data and doing fairly complicated processing on it.
How does your software work?
We read TCP/IP streams. We produce asynchronous messages to TCP/IP. The messages we produce, the customer has to write an application that consumes them. We give you an API, so you don't have to directly use TCP/IP (which means that) they can use it with an (app). In financial services, there's a dozen or so popular feed formats and we've written adapters for most of them to convert to our internal format. So in comes the market feed, (which) is an asynchronous message stream. One way to think about it is that we insist that it obey a database-style (schema) so that we read binary messages off the wire and if they're not in our format, then there's a converter that converts them. We give you a workflow-oriented GUI with a bunch of primitives and you assemble an application by dragging and dropping all of our primitives onto a workspace and then you run the workspace.
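To make the adapter-plus-primitives idea above concrete, here is a minimal Python sketch. It is not StreamBase's API or wire format: the `Tick` schema, the 20-byte `WIRE` layout, and the `adapt`, `filter_op`, `map_op`, and `pipeline` helpers are all hypothetical names invented for illustration. It only shows the general shape of the idea: binary messages are converted to a fixed schema, then pushed through a chain of small operators.

```python
import struct
from typing import Callable, Iterable, Iterator, NamedTuple


class Tick(NamedTuple):
    """Hypothetical internal schema that every incoming message must obey."""
    symbol: str
    price: float
    volume: int


# Hypothetical feed layout (an assumption, not a real market format):
# 8-byte ASCII symbol, 8-byte double price, 4-byte unsigned volume, big-endian.
WIRE = struct.Struct(">8sdI")


def adapt(raw_messages: Iterable[bytes]) -> Iterator[Tick]:
    """Adapter: convert binary wire messages into schema-conforming tuples."""
    for raw in raw_messages:
        symbol, price, volume = WIRE.unpack(raw)
        yield Tick(symbol.rstrip(b"\0 ").decode("ascii"), price, volume)


def filter_op(predicate: Callable) -> Callable:
    """Primitive: pass through only the tuples that satisfy the predicate."""
    return lambda stream: (t for t in stream if predicate(t))


def map_op(fn: Callable) -> Callable:
    """Primitive: transform each tuple as it flows by."""
    return lambda stream: (fn(t) for t in stream)


def pipeline(source: Iterable, *ops: Callable) -> Iterable:
    """Wire primitives together in order, like boxes connected on a workspace."""
    for op in ops:
        source = op(source)
    return source


def demo() -> list:
    # Three fabricated sample messages standing in for a live TCP/IP feed.
    raw = [
        WIRE.pack(b"IBM", 101.5, 300),
        WIRE.pack(b"MSFT", 27.25, 50),
        WIRE.pack(b"IBM", 102.0, 1200),
    ]
    out = pipeline(
        adapt(raw),
        filter_op(lambda t: t.symbol == "IBM"),
        map_op(lambda t: (t.symbol, t.price * t.volume)),
    )
    return list(out)
```

In a real deployment the list of sample messages would be replaced by reads from a socket, and the consuming application would receive the results asynchronously rather than as a list.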
So where is the data stored?
There's no requirement that you store the data. We are not a database system. We are a stream-processing engine that is good at doing processing on streams as they fly by. Not requiring storing the data makes a certain class of applications go way, way faster. ... Our implementation (in one particular application) runs 140,000 messages a second on a $1,500 PC. We tried the same application on one of the (commercial) RDBMSes. The best we could get it to was 900 messages a second.
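As a rough illustration of processing a stream "as it flies by" without storing it, the Python sketch below computes a moving average while holding only the last few values in memory. The function name `moving_average` and the choice of a sliding-window average are assumptions made for this example; the interview does not describe the application behind the quoted throughput numbers.

```python
from collections import deque
from typing import Iterable, Iterator


def moving_average(prices: Iterable[float], window: int) -> Iterator[float]:
    """Yield the average of the most recent `window` prices for each new price.

    Only the current window is kept in memory; older values are discarded
    as soon as they slide out, so nothing is ever written to storage.
    """
    recent = deque()
    total = 0.0
    for price in prices:
        recent.append(price)
        total += price
        if len(recent) > window:
            # Drop the oldest value and remove it from the running sum.
            total -= recent.popleft()
        yield total / len(recent)
```

For instance, `list(moving_average([1.0, 2.0, 3.0, 4.0], 2))` gives `[1.0, 1.5, 2.5, 3.5]`. Memory use depends on the window size, not on how many messages have passed through.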
What are your feelings about what's happened with Ingres, Postgres, and Informix?
Ingres suffered from benign neglect under the tutelage of Computer Associates. (Going open source) is a very reasonable strategy to try and get some traction for Ingres. I just wonder if it's too little, too late. Postgres has been an open source DBMS forever, and there's a grassroots effort that keeps Postgres going. It would be great if somebody with marketing muscle got behind Postgres, because I think it's a very high-functionality system that is very attractive if you need an extendible relational database.
What do you think the open source movement means for commercial software companies?
More power to them. I just chuckle that Microsoft is fighting Linux as hard as it can. I think open source is attractive, and I think that Linux is a fabulous model.