Databases: Powerful engines set on cruise control

All those out-of-work DBAs out there might think back on the history of database technology with some fondness. For back in the good old days, databases were complex, inflexible monsters that required armies of attendants.

These days, databases are scalable, distributed, flexible and easy to manage, and the trend is towards more of the same. Oracle 10i, slated for release later this year, is going to be all about clustering and simplified management.

One consolation for DBAs is that they are far fewer in number than the army of filing clerks they displaced almost 50 years ago — before Computerworld was even a sparkle in its founder’s eye.

Computers were originally designed as computing machines — designed to do things like crack the secret Russian codes and prove or disprove modern physics theory.

But business was being buried under a mountain of paper. Bank of America launched ERMA — Electronic Record Method of Accounting — in 1959, primarily because it could not find enough clerks to maintain its filing. Remember, there was close to zero unemployment in the 1950s.

Early databases appeared from the late 1950s, helped along by Fortran in 1957 and Cobol in 1960.

Not that they were recognisable as databases by today’s standards. Data was coded in with the applications, and it was all sequential. Tape was the latest storage device.

But the 1970s ushered in a new era — microprocessors, disk storage, Ethernet networking and relational databases.

The father of the relational database was IBM’s Dr Edgar (Ted) Codd, who published “A Relational Model of Data for Large Shared Data Banks” in 1970.

Codd died on April 18 this year aged 79 and has since received tributes from around the world.

“Ted’s basic idea was that relationships between data items should be based on the item’s values, and not on separately specified linking or nesting. This notion greatly simplified the specification of queries and allowed unprecedented flexibility to exploit existing data sets in new ways,” writes Don Chamberlin, co-inventor of SQL, on the IBM site.

At the time Dr Codd published his seminal work, the latest technology was hierarchical databases. These were a step up from sequential files in that ‘pointers’ could be embedded in a record, linking it to a single parent or a single child.

Thus, a bank could keep a record of an account, and embed a pointer signalling the owner — whose contact details might be kept separately.

However, what if it was a joint account or if someone had more than one account?
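The relational answer to that question can be sketched in a few lines. This is only an illustration, assuming a made-up three-table schema (`customer`, `account`, `account_owner`) and invented sample names; it uses Python's built-in `sqlite3` module rather than any product mentioned in this article. Ownership is recorded as rows of matching values, so a joint account or a customer with several accounts needs no special pointer structure.

```python
import sqlite3


def build_bank():
    """Create an in-memory database with a hypothetical banking schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE account (
            account_id INTEGER PRIMARY KEY,
            kind       TEXT NOT NULL
        );
        -- Ownership is a set of value pairs, not an embedded pointer,
        -- so one account can have many owners and vice versa.
        CREATE TABLE account_owner (
            account_id  INTEGER NOT NULL REFERENCES account(account_id),
            customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
            PRIMARY KEY (account_id, customer_id)
        );
        """
    )
    # Sample data is invented for illustration.
    conn.executemany("INSERT INTO customer VALUES (?, ?)",
                     [(1, "Alice"), (2, "Bob")])
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [(100, "joint cheque"), (101, "savings")])
    conn.executemany("INSERT INTO account_owner VALUES (?, ?)",
                     [(100, 1), (100, 2), (101, 1)])
    return conn


def owners_of(conn, account_id):
    """Return the names of every owner of an account, sorted by name."""
    rows = conn.execute(
        """
        SELECT c.name
        FROM customer AS c
        JOIN account_owner AS o ON o.customer_id = c.customer_id
        WHERE o.account_id = ?
        ORDER BY c.name
        """,
        (account_id,),
    )
    return [row[0] for row in rows]


def accounts_of(conn, name):
    """Return the ids of every account a named customer owns, sorted."""
    rows = conn.execute(
        """
        SELECT o.account_id
        FROM account_owner AS o
        JOIN customer AS c ON c.customer_id = o.customer_id
        WHERE c.name = ?
        ORDER BY o.account_id
        """,
        (name,),
    )
    return [row[0] for row in rows]
```

Calling `owners_of(build_bank(), 100)` lists both holders of the joint account, and `accounts_of` answers the reverse question from the same tables, with neither query written into the data itself.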

IBM’s IMS was the market leader, and is still in wide use today in the banking industry — primarily because it worked and there has been no need to fix it.

In fact, hierarchical systems worked so well, especially for IBM, that no one bothered commercialising Codd’s work until Larry Ellison had a read almost 10 years later.

In 1977, one year before the birth of Computerworld Australia, Ellison co-founded the company that became Oracle. It released the first commercially available SQL relational database in 1979, followed hard by Relational Technology’s Ingres.

Their success spurred IBM into action: its System R research prototype, under way since the mid-1970s, led to the commercial SQL/DS in 1981 and DB2 in 1983.

Relational database theory was originally tested on a 6MB file and it is testament to the foresight of Codd that his theory still stacks up in the age of the terabyte database (CERN is talking about an exabyte database in 2005; that’s 10^18 bytes).

Gartner estimates the industry will be worth $US9 billion in 2003, with Oracle, IBM and Microsoft leading a now-depleted pack.

Gartner’s database expert is Kevin Strange, VP of research, who is a big fan of Codd’s work.

“The durability of the relational model lies in the separation of the application from the data,” Strange said. “This enables shared data across multiple applications.”

Bruce Allen, senior database consultant with Grapevine Information Technology, agrees.

“That we are using the same basic technology now with data warehouses running to the terabytes means the original concept was very sound,” Allen said.

If the core database technology industry is mature, what radical new changes are expected in the next few years?

Strange and Allen both think nothing radical is in store. Instead, two trends will continue to gather pace.

Cluster technology and grid computing will mean ever more powerful database engines, and SANs and distributed computing mean ever larger physical databases.

“A decade ago a 1GB database was huge,” Allen said. “Now there is an explosion in size. Sharing the database across multiple machines using cluster technology will only increase the rate of growth.”

Second, database management will simplify still further.

“Automated management tools are increasing the viability of interoperability and reducing management workloads,” Allen said. “The database is becoming a commodity.”

Strange agrees. “If my son was leaving school, I wouldn’t recommend he got into database administration,” Strange said.

Application development might be a different story. Strange can see an explosion in application development as data becomes even more accessible.

The other area of contention is the emergence of a ‘Super SQL’ — some kind of 5GL that will complete the interoperability process begun between databases.

An early contender has been XML, but Strange sees limited potential. Two XML database companies have gone broke this year, he said, and really its use is limited to integrating applications, not databases.

The exception is document management. “XML allows you to add structure to unstructured text documents,” Strange said. “Now you can analyse.”
