Legal threats may be the high-profile risk for Linux, but the popular open source kernel project is coming face-to-face with key technical shortcomings, too. As the Linux Foundation plans its first Collaboration Summit for June 13 through 15 at the Google campus in Mountain View, Linux contributors are speaking out about kernel gaps that have no solution readily in sight.
Andrew Morton, a kernel developer best known for filtering and testing new kernel submissions in a test kernel called the "-mm tree", listed three major problem areas in a May "State of the Kernel" talk at Google: the file system, power management, and instrumentation. The file system, one of the areas of kernel development requiring the heaviest computer science work, is software that determines how to place and index data on disk or other nonvolatile storage -- and Linux's file systems are falling behind the demands of large storage users.
In an e-mail message, project founder Linus Torvalds says he agrees that the file system and power management need work. The latter, he says, is part of a bigger problem with device drivers that basically work but don't implement advanced features. But, Torvalds says, the simple instrumentation Linux already has is enough to deal with real-world performance issues.
Torvalds adds what he calls the "development flow" to the list of concerns. "We've always had issues with how certain subsystems end up having development problems due to some infrastructure issue, or just personality clashes, or some methodology being broken. Sometimes it's not 'subsystems,' but hits at an even higher level, and we end up having to change how we do things in general," he says.
Linux has already been through major shifts in development process. With the release of the 2.6.0 kernel in December 2003, Torvalds and other developers stopped maintaining separate stable and research development trees for the kernel, although a new line of stable kernel releases began again with the 2.6.11.1 kernel in 2005.
"We have a lot of good file systems, and I think most people are happy with them. But I think we could do better," Torvalds says. Morton was blunter in his talk, saying, "Basically, I think we need a new file system."
File-system developer Val Henson points out that disk capacities are likely to grow by a factor of 16 by 2013, but that bandwidth will only grow by a factor of 5, and seek time by a factor of 1.2. That means that the file-system-checking utility, fsck, will take longer and longer to run. "Fsck on multiterabyte file systems today can easily take 2 days, and in the future it will take even longer! Second, the increasing number of I/O errors means that fsck is going to happen a lot more often -- and journaling won't help," Henson wrote.
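Henson's growth factors lend themselves to a rough back-of-the-envelope estimate. The sketch below assumes, purely for illustration, that the work fsck does scales linearly with disk capacity and that a check is limited either by sequential bandwidth or by seeks. The function and constant names are hypothetical and are not taken from Henson's analysis.

```python
# Growth factors quoted by Henson for the period ending in 2013.
CAPACITY_GROWTH = 16.0    # disk capacity grows 16x
BANDWIDTH_GROWTH = 5.0    # sequential bandwidth grows 5x
SEEK_IMPROVEMENT = 1.2    # seek time improves by only 1.2x


def scan_time_growth(capacity_factor, rate_factor):
    """Return how many times longer a full pass over the disk takes.

    Assumption: the amount of work grows in step with capacity, while
    the speed of the limiting resource improves by rate_factor.
    """
    return capacity_factor / rate_factor


# Case 1: the check is limited by how fast data streams off the disk.
bandwidth_bound = scan_time_growth(CAPACITY_GROWTH, BANDWIDTH_GROWTH)

# Case 2: the check is limited by seeks (assumes the number of seeks
# also grows in step with capacity).
seek_bound = scan_time_growth(CAPACITY_GROWTH, SEEK_IMPROVEMENT)

print(f"bandwidth-bound check: {bandwidth_bound:.1f}x longer")
print(f"seek-bound check:      {seek_bound:.1f}x longer")
```

Under those assumptions, a check that takes 2 days today would take roughly 6 days if it were bandwidth-bound and closer to 27 days if it were seek-bound. A real fsck falls somewhere between the two cases, depending on how fragmented the metadata is.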
Torvalds points out that the standard Linux file system, ext3, does some wasted work on ordinary-sized disks, too. "Ext3 is ubiquitous, but actually doesn't do very well on some 'simple' cases like fsync, where it ends up flushing basically the whole journal, even if we just want to sync a single file," Torvalds says.
The "fsync" system call only has to write to disk the data associated with a single file. There is a separate "sync" system call to flush all buffered data to disk. On a busy server, flushing extra journal data would slow down an application that was trying to "fsync" just one file.
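The difference between the two calls can be seen from user space. The following is a minimal sketch using Python's standard wrappers, os.fsync() and os.sync(), around the system calls of the same names. The helper names write_durably, flush_everything and demo are hypothetical, and the sketch shows only how an application requests each kind of flush, not how ext3 carries it out.

```python
import os
import tempfile


def write_durably(path, data):
    """Write data to path and ask the kernel to commit just that file."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # move Python's own buffer into the kernel
        os.fsync(f.fileno())  # request that this one file reach the disk


def flush_everything():
    """Ask the kernel to write out all buffered data, system-wide."""
    os.sync()  # available on Unix-like systems only


def demo():
    """Write one file durably in a scratch directory and read it back."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "record.dat")
        write_durably(path, b"one committed record\n")
        with open(path, "rb") as f:
            return f.read()


if __name__ == "__main__":
    print(demo())
```

An application such as a mail server or database would typically call something like write_durably() after each transaction, which is why an fsync that drags the whole journal along with it is costly. The flush_everything() helper is defined here only for contrast and is not called by demo().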
Sun's ZFS is the hot new file system and blends what on Linux are separate RAID, logical volume management and file-system layers into a single subsystem. ZFS, however, is under the open source, but GPL-incompatible, CDDL license and so isn't available to become part of Linux directly.
Implementing a ZFS-compatible file system, even with access to the source code, may not be an option because of Sun's patents. Torvalds adds that Network Appliance has patents on some other useful file-system techniques.
Developer Ricardo Correia has devised a way around the licensing beef, with a system for running Sun's own ZFS code in user space, employing the FUSE technology. Although the project is much slower than a conventional in-kernel file system, Correia claims that "FUSE-based file systems can have comparable performance to kernel file systems, as the bottleneck is usually the disk(s), not the CPU." There is precedent for handing off a key system role to a specially privileged user-space program -- the X Window System used on Linux has always run as a user-space application. One developer has even built an experimental Linux distribution that integrates ZFS on FUSE into the installer, but the approach doesn't appear to have seen a severe test yet.
In his own "kernel report" talk at the Embedded Linux Symposium, kernel developer and author Jonathan Corbet points out two other file-system research efforts, neither of which is likely to make the mainstream soon. Ext4 is a research extension of ext3, and the process of adding Reiser4 to the kernel was stalled even before inventor Hans Reiser's arrest in California last October. (Kernel updates syndicated from Corbet's Web site, LWN, appear on LinuxWorld.com as the weekly Kernel Space feature.)
While current file systems are designed around the disks, Torvalds suggests that in the future the assumptions may need to change. "Another interesting issue is that I don't think that the day when we'll see more solid-state storage is a pipe dream any more," he says. "Sure, some people want terabyte hard disks, but for others, the multihundred GB disks just lay mostly unused. And flash storage has a totally different performance profile, which means that things that we do for traditional disk reasons, may not actually make as much sense if you actually aimed for a flash disk in the first place."