Technology: XHTML


Extensible Hypertext Markup Language (XHTML) is a transition language that combines HTML and XML. Designed to replace HTML Version 4.0, XHTML is recommended by the World Wide Web Consortium. XHTML Version 1.0 is a reformulation of HTML 4 as an XML 1.0 application. It includes three Document Type Definitions: Strict, Transitional and Frameset. Future versions of XHTML will allow modular HTML, which will suit a wide variety of devices, and document profiles, which will ensure interoperability.
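
As an illustration of what "HTML as an XML application" means in practice, a minimal XHTML 1.0 document might look like the sketch below. The title and body text are placeholders; the DOCTYPE shown selects the Strict DTD.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Placeholder title</title>
  </head>
  <body>
    <!-- Element names are lowercase and every element is closed -->
    <p>A paragraph must have a closing tag.</p>
    <p>Empty elements are closed too:<br />like this line break.</p>
  </body>
</html>
```

Because the document is well-formed XML, any XML parser can read it, not only a Web browser.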

Introducing XHTML

As shown below, HTML 4.0 and XHTML can be used to achieve the same result. But when using HTML, the tags used for presentation - how a font looks or where a word appears on-screen - must be repeated in every document. If style sheets are used with HTML, a single style sheet can define the look of an entire site. XHTML goes a step further. Because it integrates XML, you can use a Document Type Definition (DTD) to create your own tags. Style sheets can then define how every instance of each tag should look in a browser.

The days in which you could learn HTML in a short time and craft your own Web page using a text editor are numbered. On the bright side, handling any Web site that's bigger than 10 or 20 pages is about to get a lot easier.

"The big win is that you no longer have to force data structures in HTML," said Steven Pemberton, chairman of the World Wide Web Consortium's (W3C) HTML Working Group.

W3C works with businesses and governments to create Web standards. But browser makers historically haven't waited for W3C specifications and have either implemented standards before they're fully defined or implemented them incorrectly. XHTML is the W3C's attempt to redraw lines blurred by browser makers.

HTML was created to be a structural language - nothing more. But browser makers quickly began pushing the envelope, adding presentation capabilities. That often involved nonstandard tags or tricky shortcuts such as using tables to lay out a page, which could slow page-loading times drastically and complicate Web site content management.

In the XHTML specification, the language is once again only structural. Tags are used to mark up headings, paragraphs, lists, hypertext links and other structural parts of the document. Style sheets, on the other hand, handle issues of presentation: fonts, colours and margins. The intent is to simplify sites, decrease download times and more easily present the same content to multiple types of devices.

Easy does it

XHTML works by separating content from style. Content creators craft HTML; designers create style sheets. This simplifies the Web server's job, since site visitors need to download a style sheet only once. Every subsequent page that refers to that style sheet downloads much more quickly. Changing the look of the site is simplified because you have to change only a few style sheets, not thousands of HTML pages. Web server processing power is saved and less material is transmitted, since HTML documents are free of font tags and colour specifications.
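
A sketch of how pages might share one style sheet follows. The file name site.css and the page text are hypothetical; the point is only that each page carries structural markup plus a single link element, while fonts and colours live in the shared file.

```xml
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Product page</title>
    <!-- Hypothetical shared sheet; every page on the site links to the same file -->
    <link rel="stylesheet" type="text/css" href="site.css" />
  </head>
  <body>
    <h1>Product name</h1>
    <p>Description with no font tags or colour attributes.</p>
  </body>
</html>
```

Restyling the site then means editing site.css, and the pages themselves stay untouched.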

A future version of XHTML will introduce modules. Many devices, such as mobile phones, need only a subset of XHTML, and modules would filter the XHTML to include only what the device needs.

XHTML will ultimately replace proprietary Web file formats such as Portable Document Format files and Flash and other multimedia formats. For example, the Synchronised Multimedia Integration Language (SMIL Version 1.0, pronounced ‘smile’) lets designers describe the temporal behaviour and layout of a Web page as well as associate hyperlinks with objects. Along with the Scalable Vector Graphics XML standard, designers could create animations or even augment television feeds using XML.


When using HTML without style sheets, extensive tags that describe how every element is to look must be used.

[HTML example (markup not preserved): title, "Document summary", "Document content."]

XHTML

Using XHTML, all content can be written into the HTML document, the DTD can define what each of the tags means and a style sheet can define how each tag looks. Though XHTML might seem like more work up front, a site with thousands of pages might need only one DTD and a few style sheets. Any time a design change is needed, only those few style sheets would have to be updated.
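
A rough sketch of such a document is shown here. It is plain XML rather than XHTML proper, and it is an assumption about how the pieces fit together, not a reproduction of the original example: the element names are lowercased from this article's example, while the file name hyperlib.css and the title text are hypothetical. The internal DTD subset declares which tags exist, and the xml-stylesheet processing instruction points the browser at the sheet that says how each tag should look.

```xml
<?xml version="1.0"?>
<!-- Hypothetical style sheet holding the rules for head, summary and body -->
<?xml-stylesheet type="text/css" href="hyperlib.css"?>
<!DOCTYPE hyperlib [
  <!ELEMENT hyperlib (head, summary, body)>
  <!ELEMENT head (#PCDATA)>
  <!ELEMENT summary (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<hyperlib>
  <head>Document title</head>
  <summary>Document summary</summary>
  <body>Document content.</body>
</hyperlib>
```

On a real site the DTD would more likely sit in one external file shared by every page, which is what makes "one DTD and a few style sheets" workable.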



[XHTML example (markup not preserved): "Document summary", "Document content."]


head - text-transform: uppercase; font-family: "Verdana", sans-serif; color: blue

summary - color: green; font-weight: bold; font-style: italic; font-size: 80%

body - color: black; font-family: "Georgia", serif; font-size: 80%


<!ELEMENT HYPERLIB - - (HEAD, SUMMARY, BODY)>

Version road map

XHTML 1.0, released in January, is considered the transitional version from HTML 4.0 to XHTML. There are very few differences between it and HTML 4.0. In this release, JavaScript still must be hidden in comment tags. There is some new functionality from XML integration. For instance, developers can use MathML, an XML application for displaying mathematical notation.
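
As a sketch of the MathML integration, the fragment below embeds the expression x squared in an XHTML body. This is an illustration only: the XHTML 1.0 DTDs do not themselves declare MathML elements, so a page like this is well-formed XML rather than valid XHTML 1.0, and browser support is assumed.

```xml
<body xmlns="http://www.w3.org/1999/xhtml">
  <p>The area of a square with side x is:</p>
  <!-- MathML elements live in their own XML namespace -->
  <math xmlns="http://www.w3.org/1998/Math/MathML">
    <msup>
      <mi>x</mi>
      <mn>2</mn>
    </msup>
  </math>
</body>
```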

XHTML 1.1 is scheduled to be released this month. This version still allows for tables and it introduces modular XML, meaning developers can create their own XML languages or use pieces of already-established XML languages by referring to the Document Type Definition (DTD). DTDs describe what the tags in an XML document refer to, so a browser will know, for instance, that a particular tag contains a phone number. By using DTDs, companies across industries can agree about how to handle text.
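
The idea can be sketched with a tiny made-up vocabulary. The element names contact, name and phone are hypothetical, as is the sample data. Strictly, the DTD only declares which tags may appear and in what order; the shared meaning comes from the parties who agree to use it.

```xml
<?xml version="1.0"?>
<!DOCTYPE contact [
  <!-- A contact holds exactly one name followed by one phone number -->
  <!ELEMENT contact (name, phone)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
]>
<contact>
  <name>Example Person</name>
  <phone>555-0100</phone>
</contact>
```

Any company that adopts the same DTD can rely on finding the phone number inside the phone element.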

XHTML 2.0 is due next winter. It will include an extensible event mechanism. Current HTML and XHTML can react only to mouse-over or mouse-click events. The new event mechanism will handle events specific to the desktop or devices, such as telephones and Web TV.

With XHTML 2.0, content can be coded once, and browsers on the devices can refer to the DTD to know which portions of the document to exclude and which to include. Then each device can display the content most appropriately. Version 2.0 will also improve form handling. Much of the information put into forms can be validated on the client side, and users can download multiple page forms, save them in progress, then upload them when done.

XHTML 2.0 will introduce document profiles, which specify the syntax and semantics of a set of documents. By sticking to the profile, which might specify acceptable image formats, levels of scripting and style sheet support, content creators could assure interoperability. A document profile could cover audiences as broad as the users of a certain browser version or as small as a select group of specialists.

- Mathew Schwartz
