A to Z of programming languages: Smalltalk-80

We talk to Alan Kay, co-inventor of the Smalltalk family of languages, and a hero of personal computing in many respects
Smalltalk-80 co-creator, Alan Kay


Computerworld Australia is undertaking a series of investigations into the most widely used programming languages. We most recently spoke to Brad Cox, the man behind everyone's favourite Apple-flavoured language – Objective-C. Make sure to check out The A to Z of programming languages index to find them all.

This week, we take a look at the precursor to Objective-C and the foundation of much of modern programming today: Smalltalk-80. One of the men behind the language, Alan Kay, is credited not only with helping to develop the language, but also with inventing object-oriented programming as a concept, and even with inventing a personal computer concept that has eerie similarities to the iPad.

Smalltalk-80 was one of several Smalltalk languages Kay helped to shape while at Xerox's Palo Alto Research Centre, now known simply as PARC. The languages focussed on personal computing – a topic Kay still feels strongly about – and here he expands on how the work came about, the state of innovation in the modern era and the love for education he continues to hold.

Alan, you're credited with inventing the phrase "object-oriented programming (OOP)". Did the concept exist at all at the time?

I did make up this term (and it was a bad choice because it under-emphasized the more important idea of message sending). Part of the idea existed (in several systems). I could see that a more comprehensive basis could be made by going all the way to thinking of efficient whole virtual machines communicating only by messages. This would provide scaling, be a virtual version of what my research community, ARPA-IPTO [The Information Processing Techniques Office at the US Department of Defense's research facility] was starting to do with large scale networking, and also would have some powerful “algebraic” properties (like polymorphism).
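As an illustration only (not code from Kay or from any Smalltalk), the idea of objects that communicate solely by messages, with polymorphism falling out of it, can be sketched in Python. The names `Receiver`, `send`, `does_not_understand`, `Circle`, `Square` and `total_area` are hypothetical and chosen for this sketch; the `on_` prefix is an arbitrary convention assumed here.

```python
import math


class Receiver:
    """Hypothetical base class: callers interact with an object only via send()."""

    def send(self, selector, *args):
        # Look up a handler by message name; the sender never sees the internals.
        handler = getattr(self, "on_" + selector, None)
        if handler is None:
            return self.does_not_understand(selector, args)
        return handler(*args)

    def does_not_understand(self, selector, args):
        # The receiver, not the sender, decides what an unknown message means.
        return f"{type(self).__name__} does not understand {selector!r}"


class Circle(Receiver):
    def __init__(self, radius):
        self._radius = radius

    def on_area(self):
        return math.pi * self._radius ** 2


class Square(Receiver):
    def __init__(self, side):
        self._side = side

    def on_area(self):
        return self._side ** 2


def total_area(shapes):
    # Polymorphism: the same "area" message works for any receiver that answers it.
    return sum(shape.send("area") for shape in shapes)
```

Here `total_area([Square(2), Square(3)])` evaluates to 13 without the caller knowing what kind of object it is talking to, which is the "algebraic" property the answer refers to.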

Why do you think messaging was more important than object-oriented programming in Smalltalk-80?

[Marshall] McLuhan said that most people can only experience the present in terms of the past. So “new” gets turned into “news”. If it can’t, for most people, “new” is rejected unless there is no other way. Otherwise the new is filtered down into news. One of the pieces of news in OOP is that you can simulate data (in what are called “abstract data types”), and this is sometimes useful to do, but it is not the essence in any way of object oriented design.

C++ was very popular because it had a familiar (bad) syntax, and you didn’t have to learn to do OOP in order to feel au courant.

Real OOP design is very different than the previous “data-structure-and-procedure” style. And it is also true that none of the Smalltalks were really great for this either, though it was at least possible to think it and do it.

Do you think "real OOP design" was ever achieved? Is it entirely necessary anymore?

I think “real design” in terms of protected and interchangeable modules to make highly scalable systems has not been achieved yet, and is desperately needed. However, Smalltalk at its best was only a partial solution. For example, by the end of the 70s I was writing papers about why we should be “pulling” rather than “pushing”, and this was a return to some of the pattern directed stuff I had liked from Carl Hewitt in the 60s.

The difference was that I thought of the “pulling” as a kind of universal retrieval mechanism or “call by need”. This was influenced by forward inferencing (in PLANNER and OPS5), by the recent invention of spreadsheets (which I really loved), and a little later by Gelernter’s invention of LINDA. All of these provided ways of asking/telling the environment of a module what external resources it needed to do its job. I wrote about this in the September 1984 issue of Scientific American and in other papers at the time.
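To make the "pulling" idea a little more concrete, here is a minimal sketch loosely modelled on LINDA's tuple space, in which a module describes what it needs by pattern rather than by naming a provider. It is a simplification under stated assumptions: real LINDA operations block until a match appears, whereas this version returns `None`; the class name `TupleSpace`, the method name `take` (standing in for LINDA's `in`) and the use of `None` as a wildcard are choices made for this sketch.

```python
class TupleSpace:
    """Simplified, non-blocking tuple space (an illustration, not real LINDA)."""

    def __init__(self):
        self._tuples = []

    def out(self, *fields):
        # Publish a tuple into the shared space.
        self._tuples.append(tuple(fields))

    @staticmethod
    def _matches(pattern, candidate):
        # None in a pattern acts as a wildcard (so literal None cannot be matched).
        return len(pattern) == len(candidate) and all(
            p is None or p == c for p, c in zip(pattern, candidate)
        )

    def rd(self, *pattern):
        # Read the first matching tuple without removing it.
        for candidate in self._tuples:
            if self._matches(pattern, candidate):
                return candidate
        return None

    def take(self, *pattern):
        # Read and remove the first matching tuple.
        found = self.rd(*pattern)
        if found is not None:
            self._tuples.remove(found)
        return found
```

A consumer that needs a temperature reading for Perth would call `space.rd("temperature", "perth", None)` and would not need to know which module produced the value.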


Are there any aspects of the Smalltalk-80 language that you don't feel were fully developed or completed during your involvement?

Quite a bit of the control domain was unrealized, even with respect to the original plans. And also, the more general notions of what it was you were doing when you were programming did not get fleshed out as originally planned. My original conception of Smalltalk aimed to be a felicitous combination of a number of language ideas that I thought would be hugely powerful for both children and adults.

Besides the object ideas, I wanted the simplicity of LOGO, the higher levels of expression from Carl Hewitt’s PLANNER, the extensibility of Dave Fisher’s CDL and my earlier FLEX language.

While this was happening, the famous “bet” caused a much simpler, more LISP-like approach to “everything” that took a few weeks to invent and Dan Ingalls a month to implement. This provided a very useful working system just at the time that the Alto started working. We got into making a lot of personal computing ideas work using this system and never went back to some of the (really good) ideas for the early Smalltalk.

This was good in many ways, but did not get to where I thought programming should go at that time (or today). Doug Lenat at Stanford in the mid to late 70s did a number of really interesting systems that had much more of the character of “future programming”.

What contribution do you feel you made to successive programming languages like Objective-C and C++?

The progression from the first Smalltalk to the later Smalltalks was towards both efficiency and improved programming tools, not better expression. And I would term both Objective-C and especially C++ as less object oriented than any of the Smalltalks, and considerably less expressive, less safe, and less amenable to making small compact systems.

C++ was explicitly not to be like Smalltalk, but to be like Simula. Objective-C tried to be more like Smalltalk in several important ways.

However, I am no big fan of Smalltalk either, even though it compares very favourably with most programming systems today (I don’t like any of them, and I don’t think any of them are suitable for the real programming problems of today, whether for systems or for end-users).

How about computer programming as a discipline?

To me, one of the nice things about the semantics of real objects is that they are “real computers all the way down (RCATWD)” – this always retains the full ability to represent anything. The old way quickly gets to two things that aren’t computers – data and procedures – and all of a sudden the ability to defer optimizations and particular decisions in favour of behaviours has been lost.

In other words, always having real objects always retains the ability to simulate anything you want, and to send it around the planet. If you send data 1000 miles you have to send a manual and/or a programmer to make use of it. If you send the needed programs that can deal with the data, then you are sending an object (even if the design is poor).

And RCATWD also provides perfect protection in both directions. We can see this in the hardware model of the Internet (possibly the only real object-oriented system in working order).

You get language extensibility almost for free by simply agreeing on conventions for the message forms.

My thought in the 70s was that the Internet we were all working on alongside personal computing was a really good scalable design, and that we should make a virtual internet of virtual machines that could be cached by the hardware machines. It’s really too bad that this didn’t happen.


Though a lot has happened in the past 30 years, how do you feel computer programming and engineering has changed as a discipline? Is there still the space and capacity to innovate in programming languages as there was in the 1970s?

There is certainly considerable room for improvement! The taste for it and the taste for inventing the improvements doesn’t seem to be there (or at least as strongly as it was in the 60s). Academia in particular seems to have gotten very incremental and fad oriented, and a variety of factors (including non-visionary funding) make it very difficult for a professor and a few students to have big ideas and be able to make them. This is a huge problem.

The Xerox Palo Alto Research Centre (PARC) seems to have been a bit of a beehive of development and innovation in the 1970s and 80s, and formed the basis of modern computers as we know them today. Have you seen the ICT industry change significantly in terms of a culture of innovation and development?

It is fair to characterize much of what has happened since 1980 as “pretty conservative commercialization of some of the PARC inventions”. Part of the slowdown in new invention can be ascribed to the big negative changes in government funding, which in the 60s especially was able to fund high-risk, high-reward research.

I don’t see anything like PARC today in any country, company or university. There are good people around from young to old, but both the funding and academic organizations are much more incremental and conservative today.

Is there a chance at revival of those innovative institutions of the 60s? Are we too complacent towards innovation?

One part of a “revival” could be done by simply adding back a category of funding and process that was used by ARPA-IPTO in the 60s (and other great funders such as the Office of Naval Research). Basically, “fund people, not projects”, “milestones, rather than deadlines”, “visions rather than goals”. The “people not projects” part meant “super top people”, and this limited the number who could be funded (and hence also kept the funding budget relatively low).

The two dozen or so scientists who went to Xerox PARC had their PhDs funded by ARPA in the 60s, and so we were the second generation of the “personal computing and pervasive networks” vision. In today’s dollars these two dozen (plus staff support and equipment, which was more expensive back then) would cost less than $15 million per year. So this would be easy for any large company or government funder to come up with.

There are several reasons why they haven’t done it. I think in no small part that today’s funders would much rather feel very much in control of mediocre processes that will produce results (however mediocre) than be out of control with respect to processes that are very high risk and have no up front guarantees or promises (except for “best effort”).

The other part of this kind of revival has to do with the longitudinal dimensions. Basically the difference between hunting and gathering, and agriculture. The really hard projects that can’t be solved by “big engineering” require some “growing” of new ideas and of new people. Xerox PARC really benefitted from ARPA having grown us as grad students who had “drunk the Kool-Aid” early, and had deep inner determinations to do the next step (whatever that was) to make personal computing and pervasive networking happen.

A lot of the growth dynamics has to do with processes and products that have rather slight connections with the goals. For example, the US space program was done as a big engineering project and was successful, but failed to invent space travel (and probably set space travel back by 30-50 years). However, the Congress and public would not have stood for spending a decade or more trying to invent (say) atomic powered engines that could make interplanetary travel much more feasible.

Nobody really cared about interactive computing in the 60s, and the ARPA funding for it was relatively small compared to other parts of the Department of Defense effort against the Russians. So quite a lot got done in many directions, including making the postdocs who would eventually succeed at the big vision.

Objective-C's co-creator, Brad Cox, said he saw the future of computer programming in reassembling existing libraries and components, rather than completely fresh coding with each new project. Do you agree?

I think this works better in the physical world and really requires more discipline than computerists can muster right now to do it well in software. However, some better version of it is definitely part of the future.

For most things, I advocate using a dynamic language of very high level and doing a prototype from scratch in order to help clarify and debug the design of a new system – this includes extending the language to provide very expressive forms that fit what is being attempted.

We can think of this as “the meaning” of the system. The development tools should allow any needed optimizations of the meaning to be added separately so that the meaning can be used to test the optimizations (some of which will undoubtedly be adapted from libraries).
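One way to picture this separation, as a sketch only: keep a clear, slow definition as the "meaning" and check any optimization against it. The Fibonacci example and the names `fib_meaning`, `fib_optimized` and `agrees_with_meaning` are hypothetical and do not come from the interview.

```python
def fib_meaning(n):
    """The 'meaning': a direct, deliberately unoptimized definition."""
    return n if n < 2 else fib_meaning(n - 1) + fib_meaning(n - 2)


def fib_optimized(n):
    """An optimization added separately: iterative, linear time."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def agrees_with_meaning(optimized, meaning, inputs):
    # The meaning serves as the test oracle for the optimization.
    return all(optimized(x) == meaning(x) for x in inputs)
```

Calling `agrees_with_meaning(fib_optimized, fib_meaning, range(15))` checks the fast version against the definition on small inputs, so the design can change first and the optimizations can follow it.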

In other words, getting the design right – particularly so the actual lifecycle of what is being done can be adapted to future needs – is critical, and pasting something up from an existing library can be treacherous.

The goodness of the module system and how modules are invoked is also critical. For example, can we find the module we need without knowing its name? Do we have something like “semantic typing” so we can find what we “need” – i.e. if the sine function isn’t called “sine” can the system find it for us, etc.?
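A very rough sketch of finding a function by what it does rather than by its name: describe the need as example input and output pairs and search a library for anything that fits. The function `find_by_behaviour`, the opaque names `f1` to `f4` and the tolerance value are assumptions for illustration; matching on a handful of examples is far weaker than the "semantic typing" Kay describes.

```python
import math


def find_by_behaviour(candidates, examples, tolerance=1e-9):
    """Return the names of candidates that reproduce every (input, output) example."""
    matches = []
    for name, func in candidates.items():
        try:
            ok = all(abs(func(x) - expected) <= tolerance for x, expected in examples)
        except (TypeError, ValueError, OverflowError):
            # A candidate that cannot accept the input simply does not match.
            ok = False
        if ok:
            matches.append(name)
    return matches


# A hypothetical library whose names say nothing about behaviour.
library = {"f1": math.cos, "f2": math.sin, "f3": math.sqrt, "f4": math.tan}

# What we "need", expressed as behaviour: sine at a few known points.
sine_examples = [(0.0, 0.0), (math.pi / 2, 1.0), (math.pi / 6, 0.5)]
```

With these definitions, `find_by_behaviour(library, sine_examples)` locates the sine function even though it is registered under the name `f2`.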


Is a high-level dynamic language a one-size-fits-all solution for the community's problems, or do you think languages are likely to fragment further?

One of the biggest holes that didn’t get filled in computing is the idea of “meta” and what can be done with it. The ARPA/PARC community was very into this, and a large part of the success of this community had to do with its sensitivity to meta and how it was used in both hardware and software.

“Good meta” means that you can take new paths without feeling huge burdens of legacy code and legacy ideas.

We did a new Smalltalk every two years at PARC, and three quite different designs in eight years – and the meta in the previous systems was used to build the next one. But when Smalltalk-80 came into the regular world of programming, it was treated as a programming language (which it was) rather than a meta-language (which it really was), and very little change happened thereafter.

Similarly, the hardware we built at PARC was very meta, but what Intel, Motorola etc. were putting into commercial machines could hardly have been less meta. This made it very difficult to do certain important new things efficiently (and this is still the case).

As well as Smalltalk-80, you're often associated with inventing a precursor to the iPad, the Dynabook. Do you feel the personal computer has reached the vision you had in 1972, and where do you see it heading in the future?

The Dynabook was/is a service idea embodied in several hardware ideas and with many criteria for the kinds of services that it should provide to its users, especially children. It is continually surprising to me that the service conceptions haven’t been surpassed many times over by now, but quite the opposite has happened, partly because of the unholy embrace between most people’s difficulties with “new” and of what marketeers in a consumer society try to do.

What are the hurdles to those leaps in personal computing technology and concepts? Are companies attempting to redefine existing concepts or are they simply innovating too slowly?

It’s largely about the enormous difference between “News” and “New” to human minds. Marketing people really want “News” (= a little difference to perk up attention, but on something completely understandable and incremental). This allows News to be told in a minute or two, yet is interesting to humans. “New” means “invisible”, “not immediately comprehensible”, etc.

So “New” is often rejected outright, or is accepted only by denaturing it into “News”. For example, the big deal about computers is their programmability, and the big deal about that is “meta”.

For the public, the News made out of the first is to simply simulate old media they are already familiar with and make it a little more convenient on some dimensions, while often making it less convenient in ones they don’t care about (such as the poorer readability of text on a screen, especially for good readers).

For most computer people, the News that has been made out of New eliminates most meta from the way they go about designing and programming.

One way to look at this is that we are genetically much better set up to cope than to learn. So familiar-plus-pain is acceptable to most people.

You have signalled a key interest in developing for children and particularly education, something you have brought to fruition through your involvement in the One Laptop Per Child (OLPC) project, as well as Viewpoints Research Institute. What is your view on the use of computing for education?

I take “Education” in its large sense of helping people learn how to think in the best and strongest ways humans have invented. Many of these “best and strong” ways have come out of the invention of the processes of science, and it is vital for them to be learned with the same priorities that we put on reading and writing.

When something is deemed so important that it should be learned by all (like reading) rather than to be learned just by those who are interested in it (like baseball), severe motivation problems enter that must be solved. One way is to have many more peers and adults showing great interest in the general ideas (this is a bit of a chicken and egg problem). Our society generally settles for the next few lower things on the ladder (like lip service by parents about “you need to learn to read” given often from watching TV on the couch).

When my research community were working on inventing personal computing and the Internet, we thought about all these things, and concluded that we could at least make curricula with hundreds of different entry points (in analogy to the San Francisco Exploratorium or Whole Earth Catalog), and that once a thread was pulled on, it could supply enough personal motivation to help get started.

At this point, I still think that most people depend so much on the opinion of others about what it is they should be interested in, that we have a pop culture deadly embrace which makes it very difficult even for those who want to learn to even find out that there exists really good stuff. This is a kind of “Gresham’s Law for Content”.