Ian Murdock: From Debian to clustering

The ian in Debian Linux stands for Ian Murdock, a former research staff member at the University of Arizona and coauthor of the Swarm storage system. He's now president and Chief Executive Officer (CEO) of Progeny Linux Systems, headquartered in Indianapolis. Progeny is commercializing Progeny Debian and the Linux NOW clustering system, which emphasizes manageability at least as much as performance, the traditional selling feature of Linux clusters.

Last month we held an online discussion with Murdock in ITworld.com's Interviews forum. This is a partial transcript of that interview. To read the full interview, including comments from LinuxWorld.com readers, follow the link in the Resources section below.

Progeny's purpose

LinuxWorld.com: Ian, what are you after? There are a lot of companies selling clustering hardware and software; several of them had a presence at February's LinuxWorld Expo. I understand you regard NOW as distinct from any of the rest. What message do you want to get across about NOW, and how will you know when the IT world at large "gets it"?

Ian Murdock: Linux NOW as a technology is very similar to clustering in many ways, but very different in others. It is similar in how it operates: as in clustering systems, NOW takes a network of computers and builds a larger abstraction above it. The difference comes about in what that network of computers is designed to do. Clusters are normally very specialized things that sit in a machine room and do nothing but perform some very specialized task. For example, a Beowulf cluster is all about number crunching and computationally intensive things.

Linux NOW is all about the user workstations. Its primary goals are to make administration of the network easier, to make it easier for users of the network to share resources, and to build a consistent environment for users that allows them to work more productively. Of course, system administration and management, shared storage, and so on are all very important components of a cluster, so there's certainly room for NOW in the clustering arena too.

As for when the IT world at large will get it, I think they already get it; they just don't know what to do about it. Network management has been a disaster for 15 years. The classic approach to network management has been to throw manpower at the problem, lots of it, and lots of Perl scripts to glue the mishmash together. Every site I've ever seen has come up with its own solutions. Clearly, these sorts of solutions are suboptimal and certainly don't scale. There are tools now that make the problem a little more approachable, but these new tools don't address the root of the problem -- that each computer on the network has its own identity, its own configuration, its own resources. They mask the problem, try to make it less noticeable. We believe the right approach is to address the problem at a fundamental level, in the OS, to make the network look like one logical system. And that's the approach we're taking in Linux NOW.

LinuxWorld.com: Could you explain more about NOW as a technology? When someone says, "at a fundamental level in the OS," I usually think, "Uh-oh -- we're talking about unmaintainable, proprietary kernel patches, just the kind of thing to make the cure worse than the disease." On the other hand, I've also heard you speak eloquently about how NOW respects existing investments -- it works with what's already in place. How do you pull all this off?

Ian Murdock: Linux NOW makes a network of Linux workstations look and operate like a single system. The network looks like one big timesharing system rather than the collection of little timesharing systems that it actually is. There is a single filesystem shared by all workstations, down to /, /etc, and /tmp. Thus, from an administrator's point of view, there is just one system to manage rather than an entire network. And it is irrelevant to the user which computer on the network he or she is logged in to because the user's environment is the same regardless of location, again because of the shared filesystem.

In other words, Linux NOW provides what could be called a single system image (SSI), though we're mostly interested in SSI with respect to the filesystem. We're dealing with Unix, and the filesystem is the central abstraction in Unix. So, if you get the filesystem right, most other features fall into place nicely.

The other piece of SSI that NOW provides is process migration, which allows processes to move around the network to take advantage of idle resources. So, NOW is a filesystem, a process migration facility, and a set of changes to a Linux distribution to make the SSI work. It's a layer above Linux. When I say that NOW respects existing investments, I mean that we have designed the system from the beginning to integrate multiple hardware platforms together into the SSI, though we're certainly not there yet. Application compatibility is also a big concern, but that's a concern shared by all Linux vendors. Compatibility with existing infrastructure is extremely important because no matter how great of a system NOW turns out to be, no one is going to use it if they have to throw everything out the window to do so.

The primary benefit of NOW is its shared filesystem. For administrators, it reduces the problem of managing a large network of many machines to a much more approachable problem, that of managing a single system. For end users, whether the workload is comprised of office productivity applications or engineering tools, it allows resources to be easily shared, and it builds a consistent environment. It's a system that's fundamentally designed to be general purpose, useful in everything from a large office LAN to a home network of a few machines.

In a traditional network of Unix workstations, the boundaries between workstations are still visible. You have many systems. Each system has its own disk, its own administrative files, its own set of local resources. Over the years, many tools have been written to help stitch the network together. NFS and NIS are good examples of such tools. And, sure, you can stitch together the network reasonably well with tools like that. Take NFS and create file shares. Take NIS and get single sign-on. Take rdist and push all the administrative files to all the workstations every night at 2 a.m. And on and on. Yeah, it works, but it's horribly complicated, prone to error, and doesn't scale. And what happens when the guy who set it up leaves for greener pastures? Can the next person figure out how it all works? So, the primary benefit of SSI is simplicity. The single system abstraction is one that everyone understands. How do you manage a single system? How do you manage a network of 200 systems? The former is a much easier question to answer, and that's what motivates us.

More details, please

LinuxWorld.com: So how is this SSI filesystem handled? NFS or something else? Are FIFOs also SSI? What about sockets? Surely they are not transparent? Will these enhancements be open? And what about platforms beyond x86?

Ian Murdock: We're implementing a new filesystem called Pelican; NFS isn't up to the task. And, yes, we do support named pipes and sockets in the filesystem. To do SSI properly, Unix semantics have to be implemented precisely. The fact that an application is running on a network filesystem has to be completely transparent, or the filesystem can't do SSI. Pelican, as well as the rest of NOW and the enhancements we've made to Debian, are all -- or will be -- open source.

We'll only be supporting x86 in the first version; we're planning to add support for other major platforms in later releases. Also, I should point out that the beta that's available is for Progeny Debian only, the foundational piece. The filesystem and other SSI pieces are still being implemented and won't be available in beta for some time yet.

As to how multiple-platform support will work, it seems counterintuitive, but a network of PCs is really a heterogeneous network. In an SSI filesystem, files that are different from platform to platform are problematic, unless you have completely identical configurations across the entire network, which clearly isn't realistic. For example, /etc/XF86Config specifies the video driver to be used by X, which may be different from workstation to workstation. So, to have a workable SSI, we have to deal with heterogeneity somehow. To do so, we've adapted an idea from Sprite (see Resources below) called a pathname variable. As the name implies, a pathname variable allows a pathname to include the value of a variable. Take, for example, /etc/XF86Config. If the machine has a Tseng 4000, we want /etc/XF86Config to resolve to a file that works with the Tseng 4000. If the machine has an NVidia, we want /etc/XF86Config to resolve to a file that works with an NVidia. And so on. So, /etc/XF86Config is a symbolic link to /etc/XF86Config.$video, where $video is a variable that contains the type of video card in the workstation. At path traversal time, $video is expanded into the appropriate value, and the path /etc/XF86Config.$video ends up referring to the appropriate file.
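
The pathname-variable idea described above can be illustrated with a small Python sketch. This is not Pelican code and says nothing about how Progeny implements it: the symlink table, the resolve function, and the values "tseng4000" and "nvidia" are all hypothetical names chosen for illustration. The sketch only shows the general mechanism of expanding a per-machine variable while a path is being resolved.

```python
from string import Template

# Hypothetical symbolic-link table for the shared filesystem: link -> target.
# The single shared link mirrors the /etc/XF86Config example in the interview.
SYMLINKS = {"/etc/XF86Config": "/etc/XF86Config.$video"}


def resolve(path, machine_vars, symlinks=SYMLINKS, max_hops=8):
    """Follow symlinks, expanding $name variables with this machine's values.

    machine_vars is an assumed per-workstation mapping such as
    {"video": "nvidia"}; a real system would take it from the local machine.
    """
    for _ in range(max_hops):
        # Expand pathname variables; an undefined variable raises KeyError.
        path = Template(path).substitute(machine_vars)
        if path not in symlinks:
            return path
        path = symlinks[path]
    raise RuntimeError("too many levels of symbolic links")


# Two workstations share one filesystem but resolve the same name differently.
print(resolve("/etc/XF86Config", {"video": "tseng4000"}))
print(resolve("/etc/XF86Config", {"video": "nvidia"}))
```

Under these assumptions, every workstation sees the same /etc/XF86Config link, and only the per-machine value of the variable decides which concrete file the path ends up naming.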

Competition

LinuxWorld.com: When I look at Progeny, Ian, I see serious challenges. There are big ideas like application service provision (ASP), Oracle's Internet Filesystem (IFS), and even Microsoft's .Net, all of which promise to consolidate, rationalize, and externalize computational resources to commoditize site management. What can you say about IFS, for example? Is the computational landscape big enough for the two of you?

Ian Murdock: In regard to .Net and similar things, NOW is targeted at local area networks. We're not out to turn the Internet into one huge system and infrastructure for delivering applications and whatnot. We just want to improve how local area networks are managed and used. That's a big market with a huge need.

LinuxWorld.com: What'd you think of the expo in New York City? Is Linux moving forward the way it should? If you were Global Dictator, would you change anything?

Ian Murdock: I wouldn't change a thing. I think industry acceptance is vital to making an impact, and Linux is certainly making an impact. Technology is only useful if it's used.

LinuxWorld.com: What's Progeny's roadmap? What can technologists with an interest in NOW do right now, today, and what milestones are imminent?

Ian Murdock: Linux NOW is a very ambitious effort. It's going to be later this year before the complete NOW system is available, and even then it will be a preliminary release. That being said, Linux NOW is an open source project, and we'll be releasing pieces of the system as they are finished. Progeny Debian is the first piece, and it's available now. The filesystem, which is the centerpiece of NOW, will be available by summer. All the pieces will be integrated with the Debian foundation and, with the network services that we will be rolling out this spring to deliver updates, Progeny Debian users will be ideally positioned to adopt the NOW technology as it becomes available throughout the year.

Wrapping up

LinuxWorld.com: Thank you, Ian, for affording us insight into you and your company. I've bumped into several companies and individuals this winter who seem stuck in complaints about Linux and computing. You appear to concentrate on what you can control, and what you want to make of it. That's refreshing. You've got several of us looking forward to NOW's general availability.

Ian Murdock: My pleasure!
