Still waiting for swap prefetch

It has been almost two years since LWN covered the swap prefetch patch. This work, done by Con Kolivas, is based on the idea that if a system is idle, and it has pushed user data out to swap, perhaps it should spend a little time speculatively fetching that swapped data back into any free memory that might be sitting around. Then, when some application wants that memory in the future, it will already be available and the time-consuming process of fetching it from disk can be avoided.

The classic use case for this feature is a desktop system which runs memory-intensive daemons (updatedb, say, or a backup process) during the night. Those daemons may shove a lot of useful data to swap, where it will languish until the system's user arrives, coffee in hand, the next morning. Said user's coffee may well grow cold by the time the various open applications have managed to fault in enough memory to function again. Swap prefetch is intended to allow users to enjoy their computers and hot coffee at the same time.
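The overnight scenario is easy to model in miniature. What follows is only a toy sketch of the idea, not the code in the swap prefetch patch: the `ToyMemory` class, its methods, and the page names are all hypothetical, memory is reduced to a handful of frames with simple LRU eviction, and "idle" is just the moment the caller chooses to invoke the prefetch step.

```python
from collections import OrderedDict


class ToyMemory:
    """Toy model: a fixed number of RAM frames plus an unbounded swap area."""

    def __init__(self, frames):
        self.frames = frames
        self.ram = OrderedDict()  # resident pages, least recently used first
        self.swap = []            # swapped-out pages, most recently evicted last
        self.major_faults = 0     # accesses that had to read from swap

    def touch(self, page):
        """An application accesses a page."""
        if page in self.ram:
            self.ram.move_to_end(page)  # mark as most recently used
            return
        if page in self.swap:
            self.swap.remove(page)
            self.major_faults += 1      # the slow path: fetch from disk
        while len(self.ram) >= self.frames:
            victim, _ = self.ram.popitem(last=False)  # evict the LRU page
            self.swap.append(victim)
        self.ram[page] = None

    def release(self, page):
        """A process exits and gives its page back."""
        self.ram.pop(page, None)
        if page in self.swap:
            self.swap.remove(page)

    def prefetch_when_idle(self):
        """Pull swapped pages back into free frames only; never evict anything."""
        fetched = 0
        while self.swap and len(self.ram) < self.frames:
            self.ram[self.swap.pop()] = None  # most recently swapped first
            fetched += 1
        return fetched


def morning_faults(prefetch):
    """Count the swap-ins the desktop user pays for the next morning."""
    mem = ToyMemory(frames=4)
    desktop = ["editor", "browser", "mail", "terminal"]
    nightly = ["job0", "job1", "job2", "job3"]
    for page in desktop:
        mem.touch(page)        # evening: desktop applications are resident
    for page in nightly:
        mem.touch(page)        # night: the batch job pushes them out to swap
    for page in nightly:
        mem.release(page)      # the job exits, leaving free memory behind
    if prefetch:
        mem.prefetch_when_idle()
    mem.major_faults = 0       # only count what the user experiences
    for page in desktop:
        mem.touch(page)        # morning: the user returns
    return mem.major_faults


print(morning_faults(prefetch=False))  # 4
print(morning_faults(prefetch=True))   # 0
```

In this model the prefetch step only ever fills frames that are already free, so it cannot push anything else out of memory. A real kernel must also decide when the system is truly idle and which pages are worth bringing back, and the toy says nothing about either.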

There is a vocal set of users out there who will attest that swap prefetch has made their systems work better. Even so, the swap prefetch patch has languished in the -mm tree for almost all of those two years with no path to the mainline in sight. Con has given up on the patch (and on kernel development in general):

"The window for 2.6.23 has now closed and your position on this is clear. I've been supporting this code in -mm for 21 months since 16-Oct-2005 without any obvious decision for this code forwards or backwards.

"I am no longer part of your operating system's kernel's world; thus I cannot support this code any longer. Unless someone takes over the code base for swap prefetch you have to assume it is now unmaintained and should delete it."

It is an unfortunate thing when a talented and well-meaning developer runs afoul of the kernel development process and walks away. We cannot afford to lose such people. So it is worth the trouble to try to understand what went wrong.

Problem #1 is that Con chose to work in some of the trickiest parts of the kernel. Swap prefetch is a memory management patch, and those patches always have a long and difficult path into the kernel. It's not just Con who has run into this: Nick Piggin's lockless pagecache patches have been knocking on the door for just as long. The LWN article on Wu Fengguang's adaptive readahead patches appeared at about the same time as the swap prefetch article - and that was after the author had stared at them for weeks trying to work up the courage to write something. Those patches were only merged earlier this month, and, even then, only after many of the features were stripped out. Memory management is not an area for programmers looking for instant gratification.

There is a reason for this. Device drivers either work or they do not, but the virtual memory subsystem behaves a little differently for every workload which is put to it. Tweaking the heuristics which drive memory management is a difficult process; a change which makes one workload run better can, unpredictably, destroy performance somewhere else. And that "somewhere else" might not surface until some large financial institution somewhere tries to deploy a new kernel release. The core kernel maintainers have seen this sort of thing happen often enough to become quite conservative with memory management changes. Without convincing evidence that the change makes things better (or at least does no harm) in all situations, it will be hard to get a significant change merged.
