Your server is wasting your CPU

All AMD server CPUs leave the factory tuned to perfection. Then system and OS makers screw them up.

While using an AMD Barcelona (quad-core Opteron) server to create a portable benchmarking kit for InfoWorld's Test Center, I discovered something unexpected: I could induce variances of 10 to 60 per cent in some benchmark tests through combined manipulation of the server's BIOS settings, BIOS version, compiler flags, and OS release.

I attempted to document the impact that each individual change had on performance, but flipping one switch often changed the effect of all the others. This frustrating yet fascinating effort ran aground when I had fiddled with settings and flags so much that my testbed stabilized; I could no longer budge the results no matter what I did. Normally stability is a good thing, but in this case, I was more interested in investigating the variance than in eliminating it. If I tried to bring this case to an actual engineer, he or she would likely tell me it had all been a mirage.
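To illustrate why isolating one setting at a time can mislead, here is a minimal Python sketch of a full on/off sweep over two switches. Everything in it is an assumption made for illustration: the switch names, the `run_benchmark` stand-in, and the scores are invented, not taken from my testbed. The made-up scoring formula deliberately includes an interaction term, so the effect of the first switch depends on the state of the second.

```python
from itertools import product

# Hypothetical switch names; not real BIOS options or compiler flags.
SWITCHES = ["bios_option", "compiler_flag"]


def run_benchmark(config):
    """Stand-in for a real benchmark run; returns an invented score.

    The last term is an interaction: it only applies when both
    switches are on at the same time.
    """
    score = 100.0
    score += 10.0 * config["bios_option"]
    score += 5.0 * config["compiler_flag"]
    score -= 12.0 * config["bios_option"] * config["compiler_flag"]
    return score


def sweep(switches, bench):
    """Run the benchmark once for every on/off combination."""
    results = {}
    for values in product([0, 1], repeat=len(switches)):
        config = dict(zip(switches, values))
        results[values] = bench(config)
    return results


def effect_of_first(results, second_value):
    """Score change from turning the first switch on,
    with the second switch held at second_value."""
    return results[(1, second_value)] - results[(0, second_value)]


results = sweep(SWITCHES, run_benchmark)
for second_value in (0, 1):
    print(
        f"{SWITCHES[0]} effect with {SWITCHES[1]}={second_value}: "
        f"{effect_of_first(results, second_value):+.1f}"
    )
```

With these invented numbers, the first switch helps when the second is off and hurts when it is on. That is the kind of behavior that makes a one-change-at-a-time log hard to interpret, and it is why the sweep measures every combination.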

Last week, I had an opportunity to put this matter to a whole panel of engineers, the brain trust that manages performance and power testing for AMD's server CPUs. I was told that I was seeing an effect that's widely known among CPU engineers, but seldom communicated to IT. The performance envelope of a CPU and chipset is cast in silicon, but sculpted in software. Long before you lay hands on a server, BIOS and OS engineers have reshaped its finely tuned logic in code, sometimes with the real intent of making it faster or more efficient in some way that AMD hadn't considered, sometimes to compensate for overall server design flaws, and sometimes to homogenize the server to flatten its performance relative to Intel's.

Perhaps there have been cases where AMD servers were made more powerful or efficient through software tuning that deviates from AMD's advice. I sincerely doubt it. Most times, trying to outthink AMD's engineers is a fruitless exercise, but system and OS makers do it all the time. When they get it wrong -- and this is far easier to do than getting it right -- it costs you. You end up with systems that aren't performing to their potential, are letting power efficiency features go unexploited, or both.

AMD has performance engineering teams devoted to the science of optimization. Before a single system is built using any new family or major revision of AMD64 microprocessor, AMD issues detailed documentation listing each CPU's capabilities and tunable parameters. New CPUs and AMD-built chipsets go out with reference BIOS code that puts the processor in an optimal state before booting the OS. I met the engineers who develop the guidance for system makers, BIOS vendors, and the OS development teams at Microsoft, Red Hat, Suse, and Sun. They're no amateurs; I'd trust their advice. But once they do their jobs, the tuning of each system sold with AMD CPUs is out of their hands. The tiller is turned over to software.

The BIOS gets in there first. Machine code in the BIOS walks through the CPU's parameters and initializes them based on some combination of AMD's advice, the system administrator's preferences as expressed through configuration settings, and the whims of the system maker. Manufacturers contract for the development of the BIOS firmware in their servers. They determine what an admin can adjust, as well as the settings of all the things you can neither see nor change.

You'll never find a server shipped from the vendor with overly aggressive settings. Systems may be tuned downward to operate across the widest possible range of temperatures, to accommodate cheaper components, or to throttle performance in order to compensate for inadequate cooling design. Tuning can also undercut system performance in misguided attempts to meet energy efficiency targets, when that objective could be just as well served without sacrificing performance. For example, slowing memory access can cool the system considerably and lower its power draw, but the cap on performance means that tasks take longer to complete, increasing the time that the system spends drawing maximum power.
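The trade-off in that last example comes down to simple arithmetic: energy is power multiplied by time. The Python sketch below is a rough model under stated assumptions, not a measurement. The wattages and durations are invented, the function name `energy_wh` is hypothetical, and the model assumes the server drops to a fixed idle draw as soon as the task finishes.

```python
def energy_wh(active_watts, idle_watts, busy_hours, window_hours):
    """Energy in watt-hours over a fixed window: the server runs the
    task at active power, then idles for the rest of the window."""
    if busy_hours > window_hours:
        raise ValueError("task does not finish within the window")
    idle_hours = window_hours - busy_hours
    return active_watts * busy_hours + idle_watts * idle_hours


# All figures are invented for illustration.
# Untuned: higher draw under load, but the task finishes in one hour.
full_speed = energy_wh(active_watts=300, idle_watts=100,
                       busy_hours=1.0, window_hours=2.0)

# Throttled: 10 per cent lower draw under load, but the task takes 1.5 hours.
throttled = energy_wh(active_watts=270, idle_watts=100,
                      busy_hours=1.5, window_hours=2.0)

print(f"full speed: {full_speed:.0f} Wh, throttled: {throttled:.0f} Wh")
```

Under these assumed figures the throttled configuration uses more energy over the same two-hour window, because its modest saving under load is outweighed by the extra half hour spent working instead of idling. Different numbers could tip the balance the other way, which is why the result has to be measured and not assumed.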

The OS also presents an interesting issue, especially with Windows. I was surprised to learn that starting with Vista, processor drivers, critical to controlling power states, are being written exclusively by Microsoft. AMD knows a thing or two that Microsoft doesn't about tweaking an AMD64 CPU for speed and efficiency. Microsoft wants to handle this on its own, and tuning within Windows Vista and Server 2008 does not take the unique characteristics and advantages of AMD's architecture into account.
