The Object Management Group (OMG) is in the process of voting on whether to approve Unified Modeling Language (UML) 2.0, the latest version of this specification for model-driven development. InfoWorld Editor at Large Paul Krill talked about the subject with Bran Selic, IBM Distinguished Engineer and co-chairman of the OMG task force on UML 2.0.
Why should we be excited about UML 2.0?
Well, UML 2.0 is actually the first major revision of UML, and I guess the big reason to be excited is that it's based on this idea of model-driven development. It's essentially a standard that's been revamped to better support that whole concept of models taking primacy in software development.
OK, so when you say model-driven development, why is this better than the way things have been done?
There are many different flavors of model-driven development, but all are based on a common theme, or really two parallel themes. One is working at a higher level of abstraction so that you're closer to the problem domain, and modeling languages give you that ability to work above the technology, so to speak. The other aspect that is significantly better is the automation side of it, the fact that you start using computers to help bridge the gap from a (mere) level of abstraction down to the implementation.
How prevalent is model-driven development these days?
Well, I'd say it's certainly growing. It's somewhere between 10 and 15 percent of the development market right now, but that's just my estimate. Of course you can imagine I would see a lot of customers that are interested in model-driven development. But (this estimate is also from) talking to our field folks and so on.
What is the status of the vote on UML 2.0?
We've finalized the spec, which is a phase that was necessary because the spec was originally (published) but was never implemented, so it's necessary to go through this finalization phase where people raise issues. We completed that in early November, and now it's undergoing a vote in the OMG and it's going well. I haven't heard of a single dissenting vote.
When will you have the final results?
The final results I believe come in (after this) six- to eight-week period. So it'll probably be just before (the holidays). And then after that is an official rubber-stamping by the board of directors of the OMG, which should be somewhere in mid-January.
There are companies such as Borland that are already selling products that support UML 2.0. So if the products are already out there then what's the real significance of adopting the specification if people are implementing it already?
By the way, IBM of course is also supporting it, but basically what happened in all of those cases, or at least most of those cases, I can't say all of them, is that the standard was updated to follow the changes that we produce. All of the people you mention were actually involved with the finalization. The products were evolving as the standard was evolving. Most of them, and certainly (our product), are up to date, because ours is literally automatically generated from the standards.
What are you hearing from the developer community at large about UML 2.0 and how it's benefiting them or how they may be having a problem with it, if there are any problems?
Well, as you can imagine it's of course across the board. There are people who are repeating some criticisms which in my view are not justified, but in general there's a lot of interest. I just did a tour, I was in Korea for example last week, I was in Brazil the week before that. Four weeks before that I was in Europe, and I can certainly tell you there's a tremendous amount of interest in this and this is why I expect that model-driven development will actually (grow). The adoption of that will accelerate.
But one of the criticisms that you hear, I think the two most common criticisms that I hear, are, one, that it's too big, and two, that it's got no semantics, that it's mainly notation. Anybody who has actually looked at the standards would not claim that. The language is somewhat extensive but it is organized modularly so that, in fact, you don't actually have to learn the whole language in order to use it. You can learn a very small part of it to be effective with it. And then you can add on bits, or if you like, modules of it as need be. Just like you don't have to know all of English to be effective, the same thing is true in UML. And this is something we did in UML 2.0.
You're calling it a language. You still can use it with more common programming languages like C++ or C or Java, right?
Yes, you can. And a lot of people do that, I'd say a significant percentage of UML users are essentially using it to model traditional, typically object-oriented programming languages, but also in some cases legacy languages such as Cobol and so on. Those are being modeled. But yes, you can do that.
But languages such as Cobol, for example, don't have a concept such as a state machine, right? That's something that in some domains is a first-order notion, and yet if you're working with a language such as C or C#, you're going to have to build that in.
So UML has a state machine?
Yes, it has state machines. (In addition), what we added a lot in 2.0 is of course our software architecture descriptions, and here we basically took the cue from architectural description languages. We have some of the experts from Carnegie Mellon involved in the definition of that. The whole idea is to be able to model large-scale systems because that's where you need modeling most.
On large-scale systems such as what?
For example, I know of systems such as, say, large telecom systems which tend to be very complex systems consisting of tens of millions of lines of code.
What about, say, an enterprise banking application or something like that?
Oh sure, those actually don't tend to be as large individually, although they tend to be networked. And yes, you can model those things now quite accurately.
Can you describe what you mean by state machine and software architecture descriptions?
Software architecture is basically a description of a system in terms of its major components and how they interact. So it's an abstract view of the system such that it identifies the key relationships which, among other things, control the "evolvability" of the system, the maintainability, even the implementability. A good architecture is typically used as a blueprint to drive the development of the system, but then later on it identifies how the system can evolve, if you like, or constrains how the system can evolve.
And what about state machines?
State machines are just one way of describing behavior that is common when you're dealing with event-driven behavior. So, for example, in distributed systems -- and you find those in banking applications and a lot of embedded real-time applications -- anything where events occur and you have to respond appropriately. And a state machine gives it the ability to respond differently based on what's happened before because it has a notion of history built into it.
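As a rough illustration of the point about history, the following is a minimal sketch in Python of an event-driven state machine. The account states, the event names, and the `AccountStateMachine` class are all hypothetical, chosen only to echo the banking example above; none of it is UML 2.0 notation or the output of any UML tool.

```python
# Hypothetical transition table: (current state, event) -> next state.
# The states and events are illustrative assumptions, not part of UML 2.0.
TRANSITIONS = {
    ("open", "freeze"): "frozen",
    ("frozen", "unfreeze"): "open",
    ("open", "close"): "closed",
    ("frozen", "close"): "closed",
}


class AccountStateMachine:
    """Tiny event-driven state machine for a hypothetical bank account."""

    def __init__(self):
        self.state = "open"

    def handle(self, event):
        """React to an event; the reaction depends on the current state."""
        if event == "withdraw":
            # Same event, different response depending on what happened before.
            return "approved" if self.state == "open" else "rejected"
        next_state = TRANSITIONS.get((self.state, event))
        if next_state is None:
            return "ignored"  # the event is not meaningful in this state
        self.state = next_state
        return "moved to " + next_state


machine = AccountStateMachine()
first = machine.handle("withdraw")   # account is still open
machine.handle("freeze")             # an earlier event changes the state
second = machine.handle("withdraw")  # same event, different outcome
print(first, second)
```

The current state is the only "memory" the machine keeps: the second withdrawal is handled differently purely because a freeze event arrived before it.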
Is automatic generation of code from the models a feature of UML 2.0? Or would that be a feature of a specific implementation of UML 2.0?
UML 2.0 provides the foundation for that, but UML 2.0 itself is not, for example, an executable language. You have to produce a specialization of it using typically something called a profile. And from that you can produce a profile, say, for EJB. You can produce a profile for a particular framework (where) there might be a domain-specific framework or a corporate-specific framework, things like that. So what you do is you specialize the language which gives you an interesting capability because it's a specialization of UML. You can use standard UML tools on these profiles.
You mentioned that there are some criticisms of UML by particular parties. I guess Microsoft is one of those parties. What are Microsoft's criticisms of UML and what are some of the other criticisms of UML?
There's the criticism that it has no semantics. One of the things we did in UML 2.0 was very, very carefully define the semantics of UML. In fact, I'd say maybe 60 to 70 percent of the work that was done on UML 2.0 was all under the hood for that very reason, to define the semantics much more precisely.
So who's making the criticisms of UML?
There's a variety of people that I know, and then certainly companies like Microsoft have their own slice at it (that are) often, in my opinion, not justified and based on somewhat poor reasoning. For example, I recently read a blog from somebody who is at Microsoft claiming that you cannot take a subset of it, and that's just plain wrong. You might not be interested in state machines and you could simply ignore state machines, in fact, (and) work with a smaller (set) of UML concepts -- the ones that you think you need. And you can completely ignore the others, and this is something that is actually supported in the language. You can say exclude this bit or these bits from my purview so that I can work only with what I want.
So you can specialize then?
What kind of effect does it have that a vendor as critical as Microsoft is not supporting it?
My understanding is that they are supporting it in specific ways of using it, but I don't know, that's something you'll have to ask Microsoft. But I think it's somewhat unfortunate that someone as significant as Microsoft has not participated in the definition of UML. They did a long way back. But I think all that will do is introduce unnecessary confusion in the market. Especially since what they're doing is essentially taking UML and, at least from my understanding, modifying it somewhat.
And you know, I was just reading this morning an article about how Airbus and Boeing, those two very competitive, or competing, organizations, have agreed now on a standard for supplying airplane parts. And they're saying, hey, it's going to be a US$400 million saving for each of those companies, just by the standard. So the fact that we still have some -- in my opinion -- unnecessary diversity is just going to make things a little more difficult for everyone.
You think Microsoft might have a problem supporting UML because they didn't invent it?
I can't talk about their motives -- to be honest, I really don't know. But I think that UML is certainly, from what I've seen -- I go to a lot of universities and talk to professors and I can tell you everywhere, every continent I've been to -- they teach UML as part of the undergraduate curriculum. So there're a lot of people who know about UML, there's a lot of support from multiple vendors, etc. So it's not a question of will UML succeed or not. It's already done that. The question is, what can we do to move it along?
What do you see in a follow-up to UML, say UML 3.0 or 2.1?
Well I think we'll certainly have a number of releases, just like traditional programming languages have had generations. But what I expect is as we gain more experience with model-driven development and we start developing a more comprehensive theory of modeling language design, I do expect that we will have (subsequent generations) of modeling languages coming along pretty soon.
That would be a successor to UML?
A successor to UML, and improving all the mistakes that I'm sure, you know, just like people who (wrote) Fortran (admit) they made mistakes, although they did a lot of good things as well. I'm sure we have similar kinds of mistakes, it's just that we don't have the theory to figure out what would be the right way to do these things. But I'm sure in four to five years' time we'll have the start of such a theory, and that's when I expect to see the next generation of modeling languages being (ready).
Do you see UML 2.0 as the last major version of the language then?
No, I think it'll be just like (how) we still have Fortran around. There's enough of a base for it that people will be doing it and using it for a long time to come. So there will be more versions of UML for sure. The question is, will the next version be something as dramatic as UML 2.0? Or will it be just an evolution? And the dramatic stuff will happen in the next language which would be a follow-on to UML.
What type of dramatic stuff?
It's hard to say at this point. We may learn better how to customize these languages so that we don't have this (Tower of Babel) situation. I think there will be a lot of reinvention going on because most of these languages need a certain core set of concepts.
Then you have this problem with all these different tools. Who's going to build tools for a language for insurance applications? Who's going to learn, who's going to write the books for this? Who's going to teach this stuff, etc.? And so there's a lot of value in a common base.
And, of course, if the common base is not appropriate, then other modeling languages will be added. And the OMG has actually added a capability for that. There is a standard that allows you to define new modeling languages. UML itself is defined in that language, and so far there have really only been two languages deemed necessary by the OMG, UML being one of them. Then there are about 10 different standard specializations of UML, or UML profiles, that are around.
So this is the direction we're moving. UML is in effect also quite conducive to developing domain-specific languages. So it's not that it's a question of domain-specific versus UML, that's not the issue I think.
How are you defining domain-specific?
Well, a domain-specific language is a language that has abstractions close to its domain as first-class concepts. Let's say (I have) the notion of an insurance policy or something (that) is a first-order concern in my business. In my domain, I would like to be able to express that, to get an insurance policy concept. If I can define a standard insurance business domain-specific language and standardize on that, then everyone's ahead.
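To make the idea of a first-class domain concept a little more concrete, here is a small sketch in Python of what an embedded insurance vocabulary might look like. The `InsurancePolicy` and `Coverage` names, the `covers` method, and the sample figures are all assumptions made up for illustration; they do not come from UML, from any OMG profile, or from any real insurance standard.

```python
from dataclasses import dataclass, field


@dataclass
class Coverage:
    """One covered risk with its limit; a hypothetical domain concept."""
    risk: str
    limit: int


@dataclass
class InsurancePolicy:
    """The policy is expressed directly, not assembled from generic parts."""
    holder: str
    coverages: list = field(default_factory=list)

    def covers(self, risk, limit):
        # Return self so that a policy definition reads like a domain statement.
        self.coverages.append(Coverage(risk, limit))
        return self

    def total_limit(self):
        """Sum of the limits across all coverages on this policy."""
        return sum(c.limit for c in self.coverages)


# Illustrative data only.
policy = InsurancePolicy("A. Holder").covers("fire", 200_000).covers("theft", 50_000)
print(policy.total_limit())
```

The point of the sketch is only that the person writing the last two lines is working with policies and coverages rather than with lists and integers, which is the kind of closeness to the problem domain described above.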