The A-Z of Programming Languages: C++
- 25 June, 2008 21:50
Computerworld is undertaking a series of investigations into the most widely-used programming languages. Previously we have spoken to Alfred v. Aho of AWK fame, S. Tucker Taft on the Ada 1995 and 2005 revisions, Microsoft about its server-side script engine ASP, Chet Ramey about his experience maintaining Bash, and Charles H. Moore about the design and development of Forth.
In this interview, we chat to Bjarne Stroustrup of C++ fame about the design and development of C++, garbage collection and the role of facial hair in successful programming languages. Stroustrup is currently the College of Engineering Chair and Computer Science Professor at Texas A&M University, and is an AT&T labs fellow.
What prompted the development of C++?
I needed a tool for designing and implementing a distributed version of the Unix kernel. At the time, 1979, no such tool existed. I needed something that could express the structure of a program, deal directly with hardware, and be sufficiently efficient and sufficiently portable for serious systems programming.
You can find more detailed information about the design and evolution of C++ in my HOPL (History of Programming Languages) papers, which you can find on my home pages, and in my book "The Design and Evolution of C++".
Was there a particular problem you were trying to solve?
The two problems that stick in my mind were to simulate the inter-process communication infrastructure for a distributed or shared-memory system (to determine which OS services we could afford to run on separate processors), and [the need] to write the network drivers for such a system. Obviously - since Unix was written in C - I also wanted a high degree of C compatibility. Very early, 1980 onwards, it was used by other people (helped by me) for simulations of various network protocols and traffic management algorithms.
Where does the name C++ come from?
As "C with Classes" (my ancestor to C++) became popular within Bell Labs, some people found that name too much of a mouthful and started to call it C. This meant that they needed to qualify what they meant when they wanted to refer to Dennis Ritchie's language, so they used "Old C", "Straight C", and such. Somebody found that disrespectful to Dennis (neither Dennis nor I felt that) and one day I received a "request" through Bell Labs management channels to find a better name. As a result, we referred to C++ as C84 for a while. That didn't do much good, so I asked around for suggestions and picked C++ from the resulting list. Everybody agreed that semantically ++C would have been even better, but I thought that would create too many problems for non-geeks.
Were there any particularly difficult or frustrating problems you had to overcome in the development of the language?
Lots! For starters, what should be the fundamental design rules for the language? What should be in the language and what should be left out? Most people demand a tiny language providing every feature they have ever found useful in any language. Unfortunately, that's impossible.
After a short period of relying on luck and good taste, I settled on a set of "rules of thumb" intended to ensure that programs in C++ could be simultaneously elegant (as in Simula67, the language that introduced object-oriented programming) and efficient for systems programming (as in C). Obviously, not every program can be both and many are neither, but the intent was (and is) that a competent programmer should be able to express just about any idea directly and have it executed with minimal overheads (zero overheads compared to a C version).
Convincing the systems programming community of the value of type checking was surprisingly hard. The idea of checking function arguments against a function declaration was fiercely resisted by many - at least until C adopted the idea from C with Classes.
These days, object-oriented programming is just about everywhere, so it is hard for people to believe that I basically failed to convince people about its utility until I finally just put in virtual functions and demonstrated that they were fast enough for demanding uses. C++'s variant of OOP was (and is) basically that of Simula with some simplifications and speedups.
C compatibility was (and is) a major source of both problems and strengths. By being C compatible, C++ programmers were guaranteed a completeness of features that is often missing in first releases of new languages and direct (and efficient) access to a large amount of code - not just C code, but also Fortran code and more because the C calling conventions were simple and similar to what other languages supported. After all, I used to say, reuse starts by using something that already exists, rather than waiting for someone developing new components intended for reuse. On the other hand, C has many syntactic and semantic oddities and keeping in lockstep with C as it evolved has not been easy.
What are the main differences between the original C with Classes and C++?
Most of the differences were in the implementation technique. C with Classes was implemented by a preprocessor, whereas C++ requires a proper compiler (so I wrote one). It was easy to transcribe C with Classes programs into C++, but the languages were not 100% compatible. From a language point of view, the major improvement was the provision of virtual functions, which enabled classical object-oriented programming. Overloading (including operator overloading) was also added, supported by better support for inlining. It may be worth noting that the key C++ features for general resource management, constructors and destructors, were in the earliest version of C with Classes. On the other hand, templates (and exceptions) were introduced in a slightly later version of C++ (1989); before that, we primarily used macros to express generic programming ideas.
Would you have done anything differently in the development of C++ if you had the chance?
This common question is a bit unfair because of course I didn't have the benefits of almost 30 years of experience with C++ then, and much of what I know now is the result of experimentation with the earlier versions of C++. Also, I had essentially no resources then (just me - part time) so if I grandly suggest (correctly) that virtual functions, templates (with "concepts" similar to what C++0x offers), and exceptions would have made C++85 a much better language, I would be suggesting not just something that I didn't know how to design in the early 1980s but also something that - if I magically had discovered the perfect design - couldn't have been implemented in a reasonable time.
I think that shipping a better standard library with C++ 1.0 in 1985 would have been barely feasible and would have been the most significant improvement for the time. By a "better library" I mean one with a library of foundation classes that included a slightly improved version of the (then available and shipping) task library for the support of concurrency and a set of container classes. Shipping those would have encouraged development of improved versions and established a culture of using standard foundation libraries rather than corporate ones.
Later, I would have developed templates (key to C++ style generic programming) before multiple inheritance (not as major a feature as some people seem to consider it) and emphasized exceptions more. However, "exceptions" again brings to a head the problem of hindsight. Some of the most important concepts underlying the modern use of templates in C++ did not exist until a bit later. For example, the use of "guarantees" in describing safe and systematic uses of templates was only developed during the standardization of C++, notably by Dave Abrahams.
How did you feel about C++ becoming standardized in 1998 and how were you involved with the standardization process?
I worked hard on that standard for years (1989-1997) - as I am now working on its successor standard: C++0x. Keeping a main-stream language from fragmenting into feuding dialects is a hard and essential task. C++ has no owner or "sugar daddy" to supply development muscle, "free" libraries, and marketing. The ISO standard committee was essential for the growth of the C++ community and that community owes an enormous amount to the many volunteers who worked (and work) on the committee.
What is the most interesting program that you've seen written with C++?
I can't pick one and I don't usually think of a program as interesting. I look more at complete systems - of which parts are written in C++. Among such systems, NASA's Mars Rovers' autonomous driving sub-system, the Google search engine, and Amadeus' airline reservation system spring to mind. Looking at code in isolation, I think Alexander Stepanov's STL (the containers, iterators, and algorithms part of the C++ standard library) is among the most interesting, useful, and influential pieces of C++ code I have ever seen.
Have you ever seen the language used in a way that was not originally intended?
I designed C++ for generality. That is, the features were deliberately designed to do things I couldn't possibly imagine - as opposed to enforce my views of what is good. In addition, the C++ abstraction facilities (e.g., classes and templates) were designed to be optimally fast when used on conventional hardware so that people could afford to build the basic abstractions they need for a given application area (such as complex numbers and resource handles) within the language.
So, yes, I see C++ used for many things that I had not predicted and used in many ways that I had not anticipated, but usually I'm not completely stunned. I expected to be surprised, I designed for it. For example, I was very surprised by the structure of the STL and the look of code using it - I thought I knew what good container uses looked like. However, I had designed templates to preserve and use type information at compile time and worked hard to ensure that a simple function such as less-than could be inlined and compiled down to a single machine instruction. That allowed the "weaving" of separately defined code into efficient executable code, which is key to the efficiency of the STL. The biggest surprise, I guess, was that the STL matched all but one of a long list of design criteria for a general purpose container architecture that I had compiled over the years, but the way STL code looked was entirely unexpected.
So I'm often pleased with the surprises, but many times I'm dismayed at the attempts to force C++ into a mold for which it is not suited because someone didn't bother to learn the basics of C++. Of course, people doing that don't believe that they are acting irrationally; rather, they think that they know how to program and that there is nothing new or different about C++ that requires them to change their habits and learn "new tricks." People who are confident in that way structure the code exactly as they would for, say, C or Java and are surprised when C++ doesn't do what they expect. Some people are even angry, though I don't see why someone should be angry to find that they need to be more careful with the type system in C++ than in C or that there is no company supplying "free" and "standard" libraries for C++ as for Java. To use C++ well, you have to use the type system and you have to seek out or build libraries. Trying to build applications directly on the bare language or with just the standard library is wasteful of your time and effort. Fighting the type system (with lots of casts and macros) is futile.
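The earlier point about templates preserving type information, so that a simple comparison can be inlined into the algorithm that uses it, can be illustrated with a minimal sketch. It is not taken from the STL; `max_by`, `Shorter`, `largest`, and `longest` are hypothetical names chosen for this example. Because the comparison is a template parameter, its exact type is known at each call, which is what typically lets a compiler inline it rather than call through a pointer.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical generic algorithm: returns an iterator to the largest element
// according to 'less'. The type of 'less' is a template parameter, so the
// compiler sees the exact comparison code at every instantiation.
template<class Iter, class Less>
Iter max_by(Iter first, Iter last, Less less)
{
    if (first == last) return last;
    Iter best = first;
    for (++first; first != last; ++first)
        if (less(*best, *first)) best = first;
    return best;
}

// A function object: an ordinary class with operator(), defined separately
// from the algorithm and "woven" into it at compile time.
struct Shorter {
    bool operator()(const std::string& a, const std::string& b) const
    {
        return a.size() < b.size();
    }
};

// Uses the standard less-than function object for ints.
int largest(const std::vector<int>& v)
{
    assert(!v.empty());
    return *max_by(v.begin(), v.end(), std::less<int>());
}

// Uses the user-defined comparison above.
std::string longest(const std::vector<std::string>& words)
{
    assert(!words.empty());
    return *max_by(words.begin(), words.end(), Shorter());
}
```

The same `max_by` serves both element types and both comparisons without virtual calls or casts; whether a given compiler actually reduces the comparison to a single instruction depends on the compiler and its optimization settings.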
It often feels like a large number of programmers have never really used templates, even if they are C++ programmers.
You may be right about that, but many at least - I think most - are using the templates through the STL (or similar foundation libraries) and I suspect that the number of programmers who avoid templates is declining.
Why do you think this is?
Fear of what is different from what they are used to, rumors of code bloat, potential linkage problems, and spectacularly bad error messages.
Do you ever wish the GNU C++ compiler provided shorter compiler syntax errors so as to not scare uni students away?
Of course, but it is not all GCC's fault. The fundamental problem is that C++98 provides no way for the programmer to directly and simply state a template's requirements on its argument types. That is a weakness of the language - not of a compiler - and can only be completely addressed through a language change, which will be part of C++0x.
I'm referring to "concepts" which will allow C++0x programmers to precisely specify the requirements of sets of template arguments and have those requirements checked at call points and definition points (in isolation) just like any other type check in the language. For details, see any of my papers on C++0x or "Concepts: Linguistic Support for Generic Programming in C++" by Doug Gregor et al (including me) from OOPSLA'06 (available from my publications page). An experimental implementation can be downloaded from Doug Gregor's home pages.
Until "concepts" are universally available, we can use "constraints classes" to dramatically improve checking; see my technical FAQ.
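As a rough illustration of the constraints-class idea, here is a minimal sketch in the spirit of that technique. `Shape`, `Square`, and `total_area` are hypothetical names invented for this example, and the exact form of `Derived_from` shown here is an assumption rather than a quotation from the FAQ. The class does no work at run time; instantiating it simply forces the compiler to check, at the point of use, that a `T*` converts to a `B*`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical base class and one derived class, for illustration only.
struct Shape {
    virtual double area() const = 0;
    virtual ~Shape() {}
};

struct Square : Shape {
    explicit Square(double s) : side(s) {}
    double area() const { return side * side; }
    double side;
};

// Constraints class: compiles only if T* converts implicitly to B*.
template<class T, class B>
struct Derived_from {
    static void constraints(T* p) { B* pb = p; (void)pb; }
    // Taking the address of 'constraints' forces it to be instantiated
    // and therefore type-checked.
    Derived_from() { void (*check)(T*) = constraints; (void)check; }
};

// The requirement is stated up front, so a violation is reported against
// the constraint line rather than deep inside the loop body.
template<class T>
double total_area(const std::vector<T*>& shapes)
{
    Derived_from<T, Shape>();   // compile-time check, no run-time effect
    double sum = 0;
    for (std::size_t i = 0; i < shapes.size(); ++i)
        sum += shapes[i]->area();
    return sum;
}
```

Calling `total_area` with a `std::vector<int*>` would fail to compile at the `Derived_from` line, which tends to give a shorter and more relevant diagnostic than an error about `area()` not being a member of `int`.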
The STL is one of the few (if not the only) general purpose libraries for which programmers can actually see complexity guarantees. Why do you think this is?
The STL is - as usual - ahead of its time. It is hard work to provide the right guarantees and most library designers prefer to spend their efforts on more visible features. The complexity guarantees are basically one attempt among many to ensure quality.
In the last couple of years, we have seen distributed computing become more available to the average programmer. How will this affect C++?
That's hard to say, but before dealing with distributed programming, a language has to support concurrency and be able to deal with more than the conventional "flat/uniform" memory model. C++0x does exactly that. The memory model, the atomic types, and the thread local storage provide the basic guarantees needed to support a good threads library. In all, C++0x allows for the basic and efficient use of multi-cores. On top of that, we need higher-level concurrency models for easy and effective exploitation of concurrency in our applications. Language features such as "function objects" (available in C++98) and lambdas (a C++0x feature) will help that, but we need to provide support beyond the basic "let a bunch of threads loose in a common address space" view of concurrency, which I consider necessary as infrastructure and the worst possible way of organizing concurrent applications.
As ever, the C++ approach is to provide efficient primitives and very general (and efficient) abstraction mechanisms, which are then used to build higher-level abstractions as libraries.
Of course you don't have to wait for C++0x to do concurrent programming in C++. People have been doing that for years and most of what the new standard offers related to concurrency is currently available in pre-standard forms.
Do you see this leading to the creation of a new generation of general purpose languages?
Many of the "scripting languages" provide facilities for managing state in a Web environment, and that is their real strength. Simple text manipulation is fairly easily matched by libraries, such as the new C++ regular expression library (available now from boost.org) but it is hard to conceive of a language that is both general-purpose and distributed. The root of that problem is that convenient distributed programming relies on simplification and specialization. A general-purpose language cannot just provide a single high-level distribution model.
I see no fundamental reason against a general-purpose language being augmented by basic facilities for distribution, however, and I (unsuccessfully) argued that C++0x should do exactly that. I think that eventually all major languages will provide some support for distribution through a combination of direct language support, run-time support, or libraries.
Do you feel that resources like the boost libraries will provide this functionality/accessibility for C++?
Some of the boost libraries - especially the networking library - are a good beginning. The C++0x standard threads look a lot like boost threads. If at all possible, a C++ programmer should begin with an existing library (and/or tool), rather than building directly on fundamental language features and/or system threads.
In your opinion, what lasting legacy has C++ brought to computer development?
C++ brought object-oriented programming into the main stream and it is doing the same for generic programming.
If you look at some of the most successful C++ code, especially as related to general resource management, you tend to find that destructors are central to the design and indispensable. I suspect that the destructor will come to be seen as the most important individual contribution - all else relies on combinations of language features and techniques in the support of a programming style or combinations of programming styles.
Another way of looking at C++'s legacy is that it made abstraction manageable and affordable in application areas where before people needed to program directly in machine terms, such as bits, bytes, words, and addresses.
In the future, I aim for a closer integration of the object-oriented and generic programming styles and a better articulation of the ideals of generality, elegance, and efficiency.
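The destructor-centered style of resource management mentioned in this answer can be sketched as follows. This is a minimal illustration, not code from any real system: `Pool`, `Lease`, and the helper functions are hypothetical, and a plain counter stands in for a real resource such as a lock or a file handle. The constructor acquires, the destructor releases, and the release happens on every exit path, including an exception.

```cpp
#include <stdexcept>

// Hypothetical resource: a counter stands in for locks, files, sockets, etc.
struct Pool {
    int in_use = 0;
};

// Resource handle: acquires in the constructor, releases in the destructor.
class Lease {
public:
    explicit Lease(Pool& p) : pool_(p) { ++pool_.in_use; }
    ~Lease() { --pool_.in_use; }

    // A lease represents unique ownership, so copying is disabled.
    Lease(const Lease&) = delete;
    Lease& operator=(const Lease&) = delete;

private:
    Pool& pool_;
};

// Both leases are released automatically when the function returns.
int peak_usage(Pool& pool)
{
    Lease a(pool);
    Lease b(pool);
    return pool.in_use;   // value is read before the destructors run
}

// The lease is released during stack unwinding even though we throw.
void risky_work(Pool& pool)
{
    Lease lease(pool);
    throw std::runtime_error("work failed");
}

// Reports how many leases are still held after a failed operation.
int usage_after_failure(Pool& pool)
{
    try {
        risky_work(pool);
    } catch (const std::runtime_error&) {
        // nothing to clean up by hand: ~Lease already ran
    }
    return pool.in_use;
}
```

Note that no function contains an explicit release call; the cleanup logic lives in one place, the destructor, which is why the same pattern applies uniformly to memory, locks, and other resources.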
Where do you envisage C++'s future lying?
Much of where C++ has had its most significant strength since day #1: applications with a critical systems programming component, especially the provision of infrastructure. Today, essentially all infrastructures (including the implementation of all higher-level languages) are in C++ (or C) and I expect that to remain the case. Also, embedded systems programming is a major area of use and growth of C++; for example, the software for the next generation US fighter planes is in C++ (see the JSF++ coding rules on my home pages). C++ provides the most where you simultaneously need high performance and higher-level abstractions, especially under resource constraints. Curiously, this description fits both an iPod and a large-scale scientific application.
Has the evolution and popularity of the language surprised you in any way?
Nobody, with the possible exception of Al Aho (of Dragon book fame), foresaw the scale of C++'s success. I guess that during the 1980s I was simply too busy even to be surprised: The use of C++ doubled every 7.5 months, I later calculated - and that was done without a dedicated marketing department, with hardly any people, and on a shoestring budget. I aimed for generality and efficiency and succeeded beyond anyone's expectations.
By the way, I occasionally encounter people who assume that because I mildly praise C++ and defend it against detractors, I must think it's perfect. That's obviously absurd. C++ has plenty of weaknesses - and I know them better than most - but the whole point of the design and implementation exercise was not to make no mistakes (that's impossible on such a large scale and under such Draconian design constraints). The aim was to produce a tool that - in competent hands - would be effective for serious real-world systems building. In that, it succeeded beyond my wildest dreams.
How do you respond to criticism of the language, such as that it has inherited the flaws of C and that it has a very large feature set which makes it 'bloated'?
C++ inherited the weaknesses and the strengths of C, and I think that we have done a decent job at compensating for the weaknesses without compromising the strengths. C is not a simple language (its ISO standard is more than 500 pages) and most modern languages are bigger still. Obviously, C++ (like C) is "bloated" compared to toy languages, but not really that big compared to other modern languages. There are solid practical reasons why all the languages used for serious industrial work today are "bloated" - the tasks for which they are used are large and complex beyond the imaginations of ivory tower types.
Another reason for the unpleasantly large size of modern languages is the need for stability. I wrote C++ code 20 years ago that still runs today and I'm confident that it will still compile and run 20 years from now. People who build large infrastructure projects need such stability. However, to remain modern and to meet new challenges, a language must grow (either in language features or in foundation libraries), but if you remove anything, you break code. Thus, languages that are built with serious concern for their users (such as C++ and C) tend to accrete features over the decades, tend to become bloated. The alternative is beautiful languages for which you have to rewrite your code every five years.
Finally, C++ deliberately and from day #1 supported more than one programming style and the interaction of those programming styles. If you think that there is one style of programming that is best for all applications and all people - say, object-oriented programming - then you have an opportunity for simplification. However, I firmly believe that the best solutions - the most readable, maintainable, efficient, etc., solutions - to large classes of problems require more than one of the popular programming styles - say, both object-oriented programming and generic programming - so the size of C++ cannot be minimized by supporting just one programming style. This use of combinations of styles of programming is a key part of my view of C++ and a major part of its strength.
What are you proudest of in terms of the language's initial development and continuing use?
I'm proud that C++ has been used for so many applications that have helped make the world a better place. Through C++, I have made a tiny contribution to the human genome project, to high energy physics (C++ is used at CERN, Fermilab, SLAC, etc.), space exploration, wind energy, etc. You can find a short list of C++ applications on my home pages. I'm always happy when I hear of the language being put to good use.
Secondly, I'm proud that C++ has helped improve the level of quality of code in general - not just in C++. Newer languages, such as Java and C#, have been used with techniques that C++ made acceptable for real-world use and compared to code 20 years ago many of the systems we rely on today are unbelievably reliable and have been built with a reasonable degree of economy. Obviously, we can and should do better, but we can take a measure of pride in the progress we have made so far.
In terms of direct personal contribution, I was pleased to be able to write the first C++ compiler, Cfront, to be able to compile real-world programs in 1MB on a 1MHz machine. That is of course unbelievably small by today's standard, but that is what it took to get higher-level programming started on the early PCs. Cfront was written in C with Classes and then transcribed into (early) C++.
Where do you see computer programming languages heading in the near future?
"It is hard to make predictions, especially about the future." Obviously, I don't really know, but I hope that we'll see general-purpose programming languages with better abstraction mechanisms, better type safety, and better facilities for exploiting concurrency. I expect C++ to be one of those. There will also be bloated corporate infrastructures and languages; there will be special purpose (domain specific) languages galore, and there will be languages as we know them today persisting essentially unchanged in niches. Note that I'm assuming significant evolution of C++ beyond C++0x. I think that the C++ community is far too large and vigorous for the language and its standard library to become essentially static.
Do you have any advice for up-and-coming programmers?
Know the foundations of computer science: algorithms, machine architectures, data structures, etc. Don't just blindly copy techniques from application to application. Know what you are doing, that it works, and why it works. Don't think you know what the industry will be in five years' time or what you'll be doing then, so gather a portfolio of general and useful skills. Try to write better, more principled code. Work to make "programming" more of a professional activity and less of a low-level "hacking" activity (programming is also a craft, but not just a craft). Learn from the classics in the field and the better advanced textbooks; don't be satisfied with the easily digested "how to" guides and online documentation - it's shallow.
There's a section of your homepage devoted to 'Did you really say that?' Which quote from this has come back to haunt you the most?
I don't feel haunted. I posted those quotes because people keep asking me about them, so I felt I had better state them clearly. "C++ makes it harder to shoot yourself in the foot; but when you do, it takes off the whole leg" is sometimes quoted in a manner hostile to C++. That just shows immaturity. Every powerful tool can cause trouble if you misuse it and you have to be more careful with a powerful tool than with a less powerful one: You can do more harm (to yourself or others) with a car than with a bicycle, with a power saw than with a hand saw, etc. What I said in that quote is also true for other modern languages; for example, it is trivial to cause memory exhaustion in a Java program. Modern languages are power tools. That's a reason to treat them with respect and for programmers to approach their tasks with a professional attitude. It is not a reason to avoid them, because the low-level alternatives are worse still.
Time for an obligatory question about garbage collection, as we're almost at the end, and you seem to get questions about this all the time. Why do you think people are so interested in this aspect of the language?
Because resource management is a most important topic, because some people (wrongly) see GC as a sure sign of sloppy and wasteful programming, and because some people (wrongly) see GC as the one feature that distinguishes good languages from inferior ones. My basic view is that GC can be a very useful tool, but that it is neither essential nor appropriate for all programs, so that GC should be something that you can optionally use in C++. C++0x reflects that view.
My view of GC differs from that of many in that I see it as a last resort of resource management, not the first, and that I see it as one tool among many for system design rather than a fundamental tool for simplifying programming.
How do you recommend people handle memory management in C++?
My recommendation is to see memory as just one resource among many (e.g. thread handles, locks, file handles, and sockets) and to represent every resource as an object of some class. For example, memory may be used to hold elements of a container or characters of a string, so we should use types such as vector
Wherever possible, I recommend the use of such "resource handles" simply as scoped variables. In that case, there is no explicit memory management that a programmer can get wrong. When an object's lifetime cannot easily be scoped, I recommend some other simple scheme, such as use of "smart" pointers (appropriate ones provided in C++0x) or representing ownership as membership in some collection (that technique can be used in embedded systems with Draconian time and space requirements). These techniques have the virtues of applying uniformly to all kinds of resources and integrating nicely with a range of error-handling approaches.
Only where such approaches become unmanageable - such as for a system without a definite resource management or error handling architecture or for a system littered with explicit allocation operations - would I apply GC. Unfortunately, such systems are very common, so I consider this a very strong case for GC even though GC doesn't integrate cleanly with general resource management (don't even think of finalizers). Also, if a collector can be instrumented to report what garbage it finds, it becomes an excellent leak detector.
When you use scoped resource management and containers, comparatively little garbage is generated and GC becomes very fast. Such concerns are behind my claim that "C++ is my favorite garbage collected language because it generates so little garbage."
I had hoped that a garbage collector which could be optionally enabled would be part of C++0x, but there were enough technical problems that I have to make do with just a detailed specification of how such a collector integrates with the rest of the language, if provided. As is the case with essentially all C++0x features, an experimental implementation exists.
There are many aspects of garbage collection beyond what I mention here, but after all, this is an interview, not a textbook.
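The recommendation in this answer, to use resource handles as scoped variables first and to fall back on smart pointers or ownership by a collection when a lifetime cannot be scoped, might look like the following minimal sketch. It assumes a compiler supporting the standard that followed C++98 (for `std::unique_ptr`); `Widget`, `join`, `make_widget`, and `Registry` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical object whose lifetime we need to manage.
struct Widget {
    explicit Widget(std::string n) : name(std::move(n)) {}
    std::string name;
};

// 1. Scoped handles: 'out' owns its characters and frees them (or hands them
//    to the caller) automatically. There is no new/delete to get wrong.
std::string join(const std::vector<std::string>& parts, const std::string& sep)
{
    std::string out;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) out += sep;
        out += parts[i];
    }
    return out;
}

// 2. Lifetime not tied to one scope: a smart pointer carries ownership to
//    the caller and deletes the Widget when the last owner lets go.
std::unique_ptr<Widget> make_widget(const std::string& name)
{
    return std::unique_ptr<Widget>(new Widget(name));
}

// 3. Ownership as membership in a collection: every Widget added here lives
//    exactly as long as the Registry that holds it.
class Registry {
public:
    Widget& add(const std::string& name)
    {
        items_.push_back(std::unique_ptr<Widget>(new Widget(name)));
        return *items_.back();
    }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<std::unique_ptr<Widget>> items_;
};
```

In all three cases the release of memory is implied by the structure of the program rather than written out by hand, which is what keeps the amount of "garbage" small.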
On a less serious note, do you think that facial hair is related to the success of programming languages?
I guess that if we look at it philosophically everything is related somehow, but in this case we have just humor and the fashion of the times. An earlier generation of designers of successful languages was beardless: Backus (Fortran), Hopper (COBOL), and McCarthy (Lisp), as were Dahl and Nygaard (Simula and Object-Oriented Programming). In my case, I'm just pragmatic: While I was living in colder climates (Denmark, England, and New Jersey), I wore a beard; now I live in a very hot place, Texas, and choose not to suffer under a beard. Interestingly, the photo they use to illustrate an intermediate stage of my beard does no such thing. It shows me visiting Norway and reverting to cold-weather type for a few days. Maybe there are other interesting correlations? Maybe there is one between designer height and language success? Maybe there is a correlation between language success and appreciation of Monty Python? Someone could have fun doing a bit of research on this.
Finally, is there anything else you'd like to add?
Yes, I think we ought to consider the articulation of ideas and education. I have touched upon those topics a couple of times above, but the problems of getting people to understand what C++ was supposed to be and how to use it well were at least as difficult and time consuming as designing and implementing it. It is pointless to do good technical work and then not tell people about it. By themselves, language features are sterile and boring; to be useful, programmers have to learn how language features can be used in combination to serve some ideal of programming, such as object-oriented programming and generic programming.
I have of course written many purely technical papers, but much of my writing has been aimed at raising the level of abstraction of programs, to improve the quality of code, and to give people an understanding of what works and why. Asking programmers to do something without giving a reason is treating them like small children - they ought to be offended by that. The editions of "The C++ Programming Language", D&E, "Teaching Standard C++ as a New Language", and my HOPL papers, are among my attempts to articulate my ideals for C++ and to help the C++ community mature. Of course, that has been only partially successful - there is still much "cut and paste programming" being done and no shortage of poor C++ code - but I am encouraged by the amount of good code and the number of quality systems produced.
Lately, I have moved from industry to academia and now see the education problems from a different angle. We need to improve the education of our software developers. Over the last three years, I have developed a new course for freshmen (first-year students, often first-time programmers). This has given me the opportunity to address an audience I have never before known well and the result is a beginner's textbook "Programming: Principles and Practice using C++" which will be available in October.