Markup languages

In 1969, three IBM researchers created GML, a formatting language for document publishing. Understood to mean Generalized Markup Language, the letters also happened to be the initials of its creators: Charles Goldfarb, Edward Mosher and Raymond Lorie.

GML allowed text editing and formatting, and it enabled information-retrieval subsystems to share documents. Instead of a simple tagging scheme, however, GML introduced the concept of a formally defined document type containing an explicit hierarchy of structured elements.

Major portions of GML were implemented in mainframe publishing systems, and the language achieved substantial industry acceptance. IBM adopted GML and produces over 90 percent of its documents with it.

GML was expanded with additional concepts, such as short references, link processes and concurrent document types, into Standard Generalized Markup Language. SGML made inroads in the publishing world, especially at the U.S. Government Printing Office, and it became an international standard in 1986.

Still, SGML was largely unknown until 1990, when Tim Berners-Lee, inventor of the World Wide Web, created Hypertext Markup Language as an application of SGML. Soon, every type of document and data was being littered with tags at the beginning and end of text elements, like this: <tag> and </tag>. Then Extensible Markup Language (XML) came along in the late 1990s, and the IT world hasn't been the same since.

In fact, it seems that hardly a day goes by without a new markup language being announced or described. Indeed, Computerworld has published separate QuickStudies on 10 markup languages, and that just scratches the surface. A Google search on "markup language" returns more than 6 million pages.

Thus we present this shorthand guide to current markup languages. It certainly doesn't cover them all, but it does give an idea of the flexibility and power of the concept and how it is being used. Most are simple extensions of XML or document type definitions specialized for a particular area of interest, but some are quite complex.

The Languages

- Business Process Execution Language: BPEL is designed to run a series of Web-based transactions and/or characterize interfaces that are needed to complete Web-based transactions. It's used for modeling business processes, with specifications for transactions and compensating transactions, data flow, messages and scheduled events, business rules, security roles, and exceptions.

- Cell Markup Language: CellML stores and exchanges computer-based mathematical models, allowing scientists to share models even if they use different model-building software. It also enables them to reuse components from one model in another, thus accelerating model building. CellML includes mathematics and metadata by leveraging existing languages, including MathML.

- Chemical Markup Language: CML is a new approach to managing molecular information that uses recently developed Internet tools such as XML and Java. Based strictly on SGML, it's capable of holding extremely complex information structures and can therefore act as an interchange mechanism or an archiving tool. It interfaces easily with modern database architectures, such as relational or object-oriented. Most important, a large amount of generic XML software to process and transform it is already available from the community.

- DARPA Agent Markup Language: XML has a limited ability to describe the relationships between objects. DAML extends XML by using ontologies -- explicit formal specifications of how to represent the objects, concepts and other entities in a particular area of interest, along with the relationships among them.

- Dynamic Markup Language: DML is an XML-based language designed specifically for object-based graphics construction and the development of user interfaces. Similar to HTML, it includes extensions that support calculations, argument-passing and variable storage.

- Directory Services Markup Language: DSML defines the data content and structure of a directory and maintains it on distributed directories. DSML gives developers a simple and convenient way to implement XML-based applications on the Internet. Such support is crucial to e-commerce applications.

- Financial Products Markup Language: FpML is a business information exchange standard for electronic trading and processing of financial derivatives instruments. It establishes a protocol for sharing information on and dealing in derivatives and structured products.

- Hypertext Markup Language: The backbone of the Web, HTML is based on a dialect of SGML that was previously used at CERN. Its primary innovation was to allow simple hypertext links from one document to another.

- Human Markup Language: HML is part of an effort to provide a framework for the overall human communication process, including areas and concepts such as thought, emotions, behaviors, kinesics, beliefs and facial expressions, through graphical or text-based representation. It goes way beyond emoticons!

- Materials Markup Language: MatML was developed for the interchange of materials information.

- Multimedia Retrieval Markup Language: MRML unifies access to multimedia retrieval and management software components to extend their capabilities.

- Physical Markup Language: PML is a simple, general language for describing physical objects and environments for industrial, commercial and consumer applications. PML allows modularity and flexibility so it can be used in monitoring and controlling a physical environment. Applications include inventory tracking, automatic transactions, supply chain management, machine control and object-to-object communication.

- Security Assertion Markup Language: SAML is an XML-based framework for communicating user authentication, entitlement and attribute information. It allows businesses to make assertions regarding the identity, attributes and entitlements of a subject (often a human user) to other entities, such as a partner company or another enterprise application.

- Services Provisioning Markup Language: SPML is a framework for exchanging user, resource and service provisioning information between applications and organizations.

- Speech Synthesis Markup Language: SSML assists in the generation of synthetic speech in Web software and other applications by providing a standard way to control speech aspects such as pronunciation, volume, pitch and rate across different platforms.

- User Interface Markup Language: UIML permits the creation of user interfaces for any device, target language and operating system. It describes three things: the appearance of a UI, user interaction with the UI and how the UI is connected to the application logic.

- Voice Extensible Markup Language: Voice-activated applications are increasingly common, and VoiceXML specifies common features to help ensure portability between platforms.

- Wireless Markup Language: WML describes content and formats for presenting data on limited-bandwidth devices such as cellular phones and pagers. Rather than attempting to deliver the same Web page content you would see on a PC, WML presents mainly text-based information optimized for mobile devices.

- Extensible Access Control Markup Language: XACML is an XML-based schema that was designed for creating policies and automating their use to control access to disparate devices and applications on a network.

- Extensible Markup Language: XML was created to combine the extensibility of SGML with the simplicity and wide support of HTML. Basically a subset of SGML, it's simpler and easier to implement and allows most of SGML's capabilities. XML was approved as a standard by the World Wide Web Consortium in 1998.
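To make the idea of an XML-based vocabulary concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. The tag names (catalog, language, name, year), the abbr attribute and the helper functions are all made up for illustration and do not belong to any of the languages listed above; the two dates come from the text. The sketch shows a small document with paired opening and closing tags, and how a parser rejects text whose tags are not properly nested.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: <catalog>, <language>, <name> and <year> are
# invented tag names for this example, not part of any published standard.
DOCUMENT = """<catalog>
  <language abbr="SGML">
    <name>Standard Generalized Markup Language</name>
    <year>1986</year>
  </language>
  <language abbr="XML">
    <name>Extensible Markup Language</name>
    <year>1998</year>
  </language>
</catalog>"""


def list_languages(xml_text):
    """Return an (abbr, name, year) tuple for each <language> element."""
    root = ET.fromstring(xml_text)
    return [
        (lang.get("abbr"), lang.findtext("name"), int(lang.findtext("year")))
        for lang in root.findall("language")
    ]


def is_well_formed(xml_text):
    """Return True if the text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    for abbr, name, year in list_languages(DOCUMENT):
        print(f"{abbr}: {name} ({year})")
    # Properly nested tags parse; overlapping tags do not.
    print(is_well_formed("<a><b>text</b></a>"))
    print(is_well_formed("<a><b>text</a></b>"))
```

Because the parser only requires that tags be well formed, the same code would read any other vocabulary by changing the element names it looks for, which is roughly what makes specialized XML-based languages cheap to define.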

The Nonmarkup MLs

Not every language or acronym ending in "ML" represents a markup language. Here are the best-known exceptions.

- ML: "ML" originally stood for "metalanguage," but it's a general-purpose programming language designed for large projects. There are two main dialects in use today: Standard ML (SML), a mathematically defined version of the language formulated in part by some of the original language developers; and Objective Caml (OCaml), an offshoot of the original ML to which features are added at will without being defined in a standard. Other related languages include Extended ML (EML) and Alice ML.

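ML and its variants are primarily functional languages: programs are built mostly from expressions and immutable values, although both Standard ML and OCaml also permit assignment to storage through explicit mutable references. These functional languages can be difficult to program in, but their programs are much more amenable to formal analysis and proofs of correctness.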

- Unified Modeling Language: UML is a standard notation for modeling real-world objects as part of developing an object-oriented design methodology. UML is used for modeling application structure, behavior and architecture, along with business processes and data structures. Vendors of many computer-aided software engineering products support the language. UML was developed from methodologies that also describe the processes in developing and using the model.

- YAML Ain't Markup Language: YAML is an international collaboration to make a data-serialization language that is both readable by humans and computationally powerful.
