Linux cannot be said to suffer from a shortage of virtualization solutions. What is harder to come by is a paravirtualization system which is amenable to relatively easy understanding. A recent entrant into the field changes that situation significantly. With just 6,000 lines (including the user-space code), Rusty Russell's hypervisor implementation, lguest (also known as the "Rustyvisor"), provides a full, if spartan, paravirtualization mechanism for Linux.
The core of lguest is the lg loadable module. At initialization time, this module allocates a chunk of memory and maps it into the kernel's address space just above the vmalloc area -- at the top, in other words. A small hypervisor is loaded into this area; it's a bit of assembly code which mainly concerns itself with switching between the kernel and the virtualized guest. Switching involves playing with the page tables -- what looks like virtual memory to the host kernel is physical memory to the guest -- and managing register contents.
The hypervisor will be present in the guest systems' virtual address spaces as well. Allowing a guest to modify the hypervisor would be bad news, however, as that would enable the guest to escape its virtual sandbox. Since the guest kernel will run in ring 1, normal i386 page protection won't keep it from messing with the hypervisor code: paging distinguishes only user mode from supervisor mode, and ring 1 counts as supervisor. So, instead, the venerable segmentation mechanism is used to keep that code out of reach.
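The segmentation idea can be illustrated with a toy model. This is only a sketch of an x86-style limit check on a flat (base 0) segment, not lguest's code; the hypervisor address and the function names are assumptions chosen for illustration.

```python
# Hypothetical address at which the hypervisor is mapped, near the top of
# the 32-bit address space. The real value is not given in the text.
HYPERVISOR_BASE = 0xFFC00000


def guest_segment_limit(hypervisor_base):
    """Highest offset a flat (base 0) guest segment may reach."""
    return hypervisor_base - 1


def access_allowed(offset, length, limit):
    """Model of a segment limit check: every byte touched must be <= limit."""
    if offset < 0 or length <= 0:
        return False
    return offset + length - 1 <= limit
```

With the guest's segments truncated this way, an access that ends just below the hypervisor passes the check, while one that reaches into the hypervisor's pages fails it, regardless of what the page tables say.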
The lg module also implements the basics for a virtualized I/O subsystem. At the lowest level, there is a "DMA" mechanism which really just copies memory between buffers. A DMA buffer can be bound to a given address; an attempt to perform DMA to that address then copies the memory into the buffer. The DMA areas can be in memory which is shared between guests, in which case the data will be copied from one guest to another and the receiving guest will get an interrupt; this is how inter-guest networking is implemented. If no shared DMA area is found, DMA transfers are, instead, referred to the user-space hypervisor (described below) for execution. Simple disk and console drivers exist as well.
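A small model may make the "DMA" scheme easier to picture. The class below is a hypothetical sketch, not the lg module's data structures: buffers are bound to addresses, a transfer to a bound address is a plain memory copy followed by a recorded "interrupt," and a transfer to an unbound address is set aside for the user-space hypervisor.

```python
class DmaTable:
    """Toy model: buffers bound to addresses; 'DMA' is just a memory copy."""

    def __init__(self):
        self.bound = {}        # address -> receiving bytearray
        self.interrupts = []   # addresses whose receiver was notified
        self.deferred = []     # transfers left for the user-space side

    def bind(self, address, size):
        """Bind a fresh receive buffer of `size` bytes to `address`."""
        buf = bytearray(size)
        self.bound[address] = buf
        return buf

    def send(self, address, data):
        """Copy `data` into the buffer bound at `address`, if there is one.

        Returns the number of bytes copied; 0 means the transfer was
        deferred because nothing is bound at that address.
        """
        buf = self.bound.get(address)
        if buf is None:
            self.deferred.append((address, bytes(data)))
            return 0
        n = min(len(buf), len(data))
        buf[:n] = data[:n]
        self.interrupts.append(address)  # stand-in for the guest interrupt
        return n
```

In this model, inter-guest networking corresponds to one guest calling send() on an address at which another guest has bound a buffer.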
Finally, the lg module implements a controlling interface accessed via /proc/lguest -- a feature which might just have to be changed before lguest goes into the mainline. The user-space hypervisor creates a guest by writing an "initialize" command to this file, specifying the memory range to use, where to find the kernel, etc. This interface can also be used to receive and execute DMA operations and send interrupts to the guest system. Interestingly, the way to actually cause the guest to run is to read from the control file; execution will continue until the guest blocks on something requiring user-space attention.
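The read-to-run pattern can be sketched as a loop. Everything here is an assumption made for illustration: FakeControlFile is a stand-in object rather than the real /proc/lguest file, and the command tuples and event dictionaries do not reflect the actual formats, which are not described above.

```python
class FakeControlFile:
    """Stand-in for the control file; NOT the real /proc/lguest format."""

    def __init__(self, guest_events):
        self.initialized = False
        self.events = list(guest_events)  # things the guest will block on
        self.written = []                 # commands written by user space

    def write(self, command):
        self.written.append(command)
        if command[0] == "initialize":
            self.initialized = True

    def read(self):
        """'Run' the guest until it needs user-space attention."""
        if not self.initialized:
            raise RuntimeError("guest not initialized")
        return self.events.pop(0) if self.events else None


def run_guest(ctl, handle_dma):
    """Initialize a guest, then alternate between running and servicing it."""
    ctl.write(("initialize", {"mem_pages": 1024}))  # hypothetical parameters
    serviced = 0
    while True:
        event = ctl.read()          # guest executes until it blocks
        if event is None:           # in this model: guest has finished
            return serviced
        handle_dma(event)           # perform the requested DMA operation
        ctl.write(("interrupt", event["irq"]))  # tell the guest it is done
        serviced += 1
```

The point of the sketch is the shape of the loop: each read hands the processor to the guest, and control returns to user space only when there is work that the kernel side cannot do on its own.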