Can there be too much of a good thing? That’s certainly true for computer input. Do an Internet search on the term buffer overflow, and you’ll come up with hundreds of thousands of links, most related to security.
In the National Institute of Standards and Technology’s ICAT index of computer vulnerabilities (http://icat.nist.gov), six of the top 10 involve buffer overflows. In 1999, the now-defunct research firm Hurwitz Group named buffer overflow the No. 1 computer vulnerability. Four years later, it’s still a major problem.
If you’ve ever poured a bucket of water into a litre-size pot, you know what overflow means — water spills all around.
Inside a computer, something similar happens if you try to store too much data in a space designed for less. Input normally goes into a temporary storage area, called a buffer, whose length is defined in the program or the operating system.
Ideally, programs check data length and won’t let you input an overlong data string. But most programs assume that data will always fit into the space assigned to it. Operating systems use buffers called stacks, where data is stored temporarily between operations. These, too, can overflow.
When a too-long data string goes into the buffer, any excess is written into the area of memory immediately following the space reserved for the buffer — which might be another data storage buffer, a pointer to the next instruction or another program’s output area. Whatever is there is overwritten and destroyed.
That in itself is a problem. Just trashing a piece of data or set of instructions might cause a program or the operating system to crash. But much worse could happen. The extra bits might be interpreted as instructions and executed; they could do almost anything and would execute at the level of privilege (which could be root, the highest level) assigned to that particular memory area.
Buffer overflow results from a well-known, easily understood programming error. If a program doesn’t check for overflow on each character and stop accepting data when its buffer is filled, a potential buffer overflow is waiting to happen. However, such checking has been regarded as unproductive overhead — when computers were less powerful and had less memory, there was some justification for not making such checks. Moore’s Law has removed that excuse, but we’re still running a lot of code written 10 or 20 years ago, even inside current releases of major applications.
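The per-character check the paragraph describes is only a few lines of code. The sketch below shows one way to write it; the name `bounded_copy` and its interface are our own invention for illustration, not a standard function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy input into `dest` (capacity `cap` bytes), checking the remaining
 * space before storing each character and stopping when the buffer is
 * full. Returns the number of characters actually stored. */
static size_t bounded_copy(char *dest, size_t cap, const char *src) {
    size_t i = 0;
    if (cap == 0)
        return 0;
    while (src[i] != '\0' && i < cap - 1) { /* the overhead: one test per char */
        dest[i] = src[i];
        i++;
    }
    dest[i] = '\0'; /* always leave room for the terminator */
    return i;
}
```

The cost is one comparison per character copied — trivial on modern hardware, which is why the old performance excuse no longer holds.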
Some programming languages are immune to buffer overflow: Perl automatically resizes arrays, and Ada95 detects and prevents buffer overflows. However, C — the most widely used programming language today — has no built-in bounds checking, and C programs often write past the end of a character array.
Also, the standard C library has many functions for copying or appending strings that do no boundary checking. C++ is slightly better but can still create buffer overflows.
Buffer overflow has become one of the preferred attack methods for writers of viruses and Trojan horse programs. Crackers are adept at finding programs where they can overfill buffers and trigger specific actions running under root privilege — say, telling the computer to damage files, change data, disclose sensitive information or create a trapdoor access point.
In July 2000, it was discovered that Microsoft Outlook and Outlook Express let attackers compromise target computers simply by sending e-mail messages. No one even had to open a message; as soon as the user downloaded the message, message-header routines went into action — with unchecked buffers that could overflow and trigger code execution. Microsoft has since created a patch that eliminates the vulnerability.