Banish Bottlenecks

FRAMINGHAM (04/10/2000) - Using an e-commerce site is much like driving a car on a highway. The speed at which a request gets from Point A to Point B depends on the traffic it encounters and the stops it must make.

As Web sites gain interaction capabilities, they must also contend with more potential bottlenecks. Firewalls and switches must check out requests before releasing them to the Web server. Application servers must first talk to back-end databases to determine a final destination. Payment information must be verified by a third-party credit-card service. Each operation acts like a tollbooth on a highway, slowing the request on its way. And since bandwidth is a finite resource, each request finds itself competing for space in an increasingly crowded pipe.

Slow performance isn't an option for an e-commerce Web site, however. Customers want faster processing and quicker navigation even as they're demanding more sophisticated features. Unfortunately, the most obvious solutions to the problem of slow commerce sites - more bandwidth, more boxes, more servers - aren't always the most appropriate answers to the performance question.

Finding solutions to bottlenecks requires a review of every part of the e-commerce architecture. In most cases, the network needs some combination of caching, clustering and load balancing.

Caching In

One way to speed performance and reduce traffic is to store, or cache, seldom-updated content. This can mean pushing content to the network edge, where it doesn't travel through as many switches or servers.
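
As a rough illustration of the idea, the sketch below keeps a copy of seldom-updated content for a fixed time and goes back to the origin server only when that copy has expired. The class name, the `fetch` callback and the 60-second lifetime used in testing are assumptions made for the example, not details of any product mentioned here.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Tiny time-based cache for seldom-updated content (illustrative only)."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # slow path: a trip to the origin server
        self._ttl = ttl_seconds      # how long a stored copy counts as fresh
        self._clock = clock          # injectable so the sketch can be tested
        self._store: Dict[str, Tuple[float, str]] = {}
        self.origin_hits = 0         # number of trips to the origin

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]          # fresh copy: no origin round trip
        self.origin_hits += 1
        body = self._fetch(url)
        self._store[url] = (now, body)
        return body
```

Every request answered from the stored copy is one that never reaches the firewalls, switches and servers behind it, which is where the traffic savings come from.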

When The Motley Fool Inc. began dispensing investment advice in 1995, static HTML pages dished up much of the information. "Back then, we didn't need a database back end. Now it's the heart of our system," explains Dwight Gibbs, "chief techie geek" at the Alexandria, Virginia-based firm.

Today, The Motley Fool Web site, www.fool.com, is much more interactive. Users can access message boards, participate in discussions and listen in on live conferences. Introducing more and more complexity inevitably degrades performance. "Nothing is ever going to be as fast as static HTML," says Gibbs, "so the question is whether hardware can keep up."

Today, Fool.com uses a series of Compaq Computer Corp. machines running Windows NT, Microsoft Corp.'s Internet Information Server (IIS) and transaction servers. F5 Networks Inc.'s Big-IP routes traffic among machines.

Using Akamai Technologies Inc.'s content delivery service last year, Fool.com achieved great performance gains. Cambridge, Massachusetts-based Akamai's proprietary technologies reside on a global network of 2,000 servers. The Akamai system routes each request for content to servers that are geographically closer to the customer so the request travels a shorter distance along the Internet.

By off-loading requests for specific graphics-intensive pages to the Akamai network, Fool.com fields fewer requests through its servers, firewalls and switches. Existing bandwidth gets used more efficiently. Gibbs claims that Fool.com realizes 30 percent to 50 percent gains in performance by using Akamai's service.

Although it can improve delivery speed, "caching does not affect transaction performance," says Walt Smith, chief engineer at Atlanta-based e-commerce consulting firm iXL Inc. But delivering multimedia, although difficult, is just one bottleneck along the way.

Balancing the Load

Craig Johnson, senior wide-area network engineer at DiscoverMusic.com, similarly concluded that transaction performance was most critical. His Seattle-based company delivers, via the Internet, the music samples heard on the best-known CD commerce sites. Interruptions mean lost revenue.

DiscoverMusic.com began in 1996 as Enslo Audio Imaging International, a subsidiary of Muzak LLC. The decision to spin off the division as a separate publicly traded company coincided with the need to improve the transaction performance of the site. As investors began evaluating the company, they raised a significant concern: If you base revenue on service being available, what is your disaster-recovery plan?

Back then, the service, hosted by MCI WorldCom Inc. and running on just two Intergraph Corp. servers, couldn't quickly recover from an interruption.

Lengthy delays occurred when the second server picked up a request. And at that time, DiscoverMusic.com was delivering fewer than 2 million streams of music per month.

To provide continuous disaster-recovery service, DiscoverMusic.com built server farms in Seattle and Herndon, Virginia, with full redundancy in power, services, hardware and software, all capable of responding to the same request.

To assure transaction performance, Johnson needed a product that could reliably handle the streaming audio requests and route each request to the appropriate server.

He chose F5 Networks' Big-IP and 3DNS products. These run as network appliances, minimizing impact on the network hardware and software infrastructure. During the initial tests, "we were high-fiving each other," Johnson recalls. DiscoverMusic.com found that it could interrupt the audio stream and have a second server pick up the request without missing a beat. The company now serves 60 million to 70 million audio streams per month and has experienced no interruptions of service since initiating the new setup in November 1998.

Routing requests to the least-busy server won't solve performance problems at all e-commerce sites, however. Given the varying levels of computing power in machines that support e-commerce applications and the types of features that must be supported, certain sites employ more complex load-balancing solutions.

"The problem with traditional load balancing is its round-robin approach," says John Puckett, CIO at Waltham, Massachusetts-based toy retailer Toysmart.com Inc. In a traditional load-balancing system, if a server configuration had a new PC and an old Cray machine, the least-busy server would get the request.

Problems occur when the older server gets a request it can't handle.
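
The difference can be sketched in a few lines. The first function below hands out requests in strict rotation and ignores what each machine can handle; the second sends each request to the machine with the lowest ratio of load to capacity. The server names and capacity figures are hypothetical, and the code is a simplified illustration rather than the behavior of any particular load balancer.

```python
from itertools import cycle
from typing import Dict, List


def round_robin(servers: List[str], n_requests: int) -> Dict[str, int]:
    """Hand out requests in strict rotation, ignoring server capacity."""
    counts = {s: 0 for s in servers}
    rotation = cycle(servers)
    for _ in range(n_requests):
        counts[next(rotation)] += 1
    return counts


def capacity_aware(capacity: Dict[str, int], n_requests: int) -> Dict[str, int]:
    """Send each request to the server with the lowest load-to-capacity ratio."""
    counts = {s: 0 for s in capacity}
    for _ in range(n_requests):
        target = min(counts, key=lambda s: counts[s] / capacity[s])
        counts[target] += 1
    return counts
```

With a fast machine rated at three units and an old one rated at one, eight requests split four and four under rotation but six and two under the capacity-aware rule.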

Toysmart.com doesn't run on Cray computers, but its configuration does include Sun Microsystems Inc. Solaris machines and a multitude of Compaq servers of varying computing power running Windows NT. To make the most efficient use of its existing server configuration, Toysmart.com implemented a switch from ArrowPoint Communications Inc. in Acton, Massachusetts. It uses a rules-based engine to determine where a user request should be sent.
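
In general terms, a rules-based approach checks each request against an ordered list of conditions and sends it to the pool named by the first rule that matches. The sketch below shows that pattern only; the rules, pool names and `Request` fields are invented for the example and do not describe ArrowPoint's engine or Toysmart.com's configuration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Request:
    path: str
    secure: bool = False


Rule = Tuple[Callable[[Request], bool], str]

# Hypothetical rules and pool names; the first matching rule wins.
RULES: List[Rule] = [
    (lambda r: r.secure, "checkout-pool"),
    (lambda r: r.path.endswith((".gif", ".jpg")), "image-pool"),
    (lambda r: r.path.startswith("/search"), "search-pool"),
]


def route(request: Request, default: str = "general-pool") -> str:
    """Return the name of the server pool that should handle the request."""
    for matches, pool in RULES:
        if matches(request):
            return pool
    return default
```

Because the rules are ordered, a secure request for an image would go to the checkout pool, so the order itself becomes part of the tuning.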

"Most people think about the appliances and connections more than the network and bandwidth," says Puckett. By caching graphics at the network, tuning the local cache of the server boxes and balancing server loads, Toysmart.com uses its existing bandwidth more efficiently. With these improvements, the company doubled its capacity in terms of the number of pages its site can supply per minute.

Load balancing can happen at the server level as well as the switch communications level, says George Dodson, a member of the board of directors of the Computer Measurement Group Inc., an independent nonprofit organization in Turnersville, New Jersey.

Called clustering, this technique of grouping independent servers to work as a single system can often improve overall site performance.

The Motley Fool and Toysmart.com both use Microsoft Cluster Server, which is bundled in the enterprise edition of Windows NT. This connects two servers so one can take over for the other in case of failure. Whereas load balancing increases performance, this type of clustering improves a site's reliability more than its speed.
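
The two-server takeover described here can be pictured as an active/passive pair. The sketch below is a generic illustration of that pattern, not Microsoft Cluster Server's actual mechanism; the node names and the `is_healthy` probe are assumptions.

```python
from typing import Callable


class FailoverPair:
    """Active/passive pair: the standby serves only if the primary is down."""

    def __init__(self, primary: str, standby: str,
                 is_healthy: Callable[[str], bool]) -> None:
        self.primary = primary
        self.standby = standby
        self._is_healthy = is_healthy   # hypothetical health probe

    def active_node(self) -> str:
        """Return the node that should receive traffic right now."""
        if self._is_healthy(self.primary):
            return self.primary
        if self._is_healthy(self.standby):
            return self.standby
        raise RuntimeError("both nodes are down")
```

Only one node does the work at any moment, which is why this arrangement buys reliability rather than speed.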

Load balancing within the server, or clustering, made a difference at ETrade Group Inc. Begun in 1982 as the first online brokerage service, ETrade implemented its e-commerce service in 1996. Back then, it managed 60,000 accounts. Today, it manages 1.5 million.

According to Gary Kattge, director of quality assurance at ETrade, customers look for reliability and speed. Given the constant introduction of new content and features, that's no small feat.

Caching of content and clustering of back-end databases have yielded the greatest performance improvements at ETrade. Its clustering technique puts more than one server into the same box, increasing the speed at which one server can take over for another. ETrade's system focuses on routing a request to the server best suited to handle it - something Kattge calls "services packaging."

It looks at what features are most often used and guides those requests toward higher-capacity servers tuned to handle them.

Consider ETrade's "smart alerts," which notify an account holder when an event, such as a stock hitting a certain price, occurs. "These represent a huge flow of data," says Kattge. "We set aside certain parts of the system tuned to handle these specific requests. We don't distribute the load evenly."

Working the Back End

Still other companies find that their performance bottlenecks occur once the application server tries to communicate with the back-end databases. Database caching techniques significantly helped out customers of Interlink Communication Systems Inc.'s (ICS) sites.

Clearwater, Florida-based ICS resells data communication hardware to other businesses. It started an e-commerce site, Interlinkweb.com, two years ago and has introduced stores targeted at specific customers.

ICS found that its customers were mainly interested in getting reports on their account status. The company's site simplifies the purchasing procedures and order-checking processes for corporations. It allows corporate procurement officers to name other buyers and set credit limits. Because 90 percent of their purchases are made through purchase orders, ICS's customers must know where their accounts stand.

To do this, ICS links its Web-site ordering with the Dynamics accounting system from Great Plains Software Inc. "Our goal is to maximize the customer experience with as much functionality as possible without adding inherent inefficiencies," says Paul Dietrich, ICS's chief technology officer. For a reporting-intensive site, this is a challenge, he says.

Although ICS is a Microsoft shop that runs IIS, Site Server and Commerce Server, it recently abandoned Microsoft ActiveX components for communicating with the back-end databases. Using these map query objects meant repeating certain executions that slowed the generation of reports.

ICS now relies on stored procedures, using Microsoft SQL Server 7.0, for handling the reporting-intensive features. Stored procedures are like caches for database requests. If a report requires five steps, the stored procedure has the first three already completed. Dietrich says this change increased reporting execution speed by a factor of three to five.
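
The analogy can be sketched in ordinary code: do the steps that never change once, then repeat only the steps that depend on the individual request. The example below is a loose illustration of that idea, not database code and not how SQL Server implements stored procedures; the class, field names and account data are hypothetical.

```python
from typing import Dict, List


class PreparedReport:
    """Does the request-independent steps once, then reuses them per call."""

    def __init__(self, template: str) -> None:
        self.prep_runs = 0
        self._fields = self._prepare(template)   # one-time work

    def _prepare(self, template: str) -> List[str]:
        # Stand-in for parsing, validating and planning a query.
        self.prep_runs += 1
        return [f.strip() for f in template.split(",")]

    def run(self, account: Dict[str, object]) -> str:
        # Per-request work only: fill in the values for this account.
        return "; ".join(f"{f}={account[f]}" for f in self._fields)
```

However many reports are run, the preparation happens once, so the saving grows with the number of requests.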

In essence, an e-commerce site must scale with the number of users, respond reliably and quickly, and optimize its use of bandwidth. When it comes to performance, caching, clustering and load balancing, applied at the appropriate junctures, make these three goals more attainable.

Shand is a freelance writer in Somerville, Mass.

Proflowers.com

The problem: Keep Proflowers.com's online florist site from wilting under a 28-fold traffic increase from 30,000 visitors last February.

The infrastructure: Four Web servers running Windows NT and Microsoft's IIS 4.0.

The solution: Add servers and increase bandwidth capacity using CacheFlow Inc.'s network appliance and put the most graphics-heavy pages in Akamai's caching service network.

The results: Performance has been boosted by up to 400 percent overall. The main page's maximum download time of 50 seconds has been reduced to a very acceptable 15 seconds.

Started in San Diego in April 1998, Proflowers.com lets its users purchase flower arrangements directly from the grower. It recently merged with Flowerfarm.com, enabling the combined entity to provide flowers to customers around the world.

When Yoshio Kurtz, director of development, joined the company in May 1999, this Allaire Corp. ColdFusion site ran on four NT/IIS Web servers. But users with slow modems could wait 50 seconds for the site's graphics-intensive main page to download during maximum traffic periods. Today, it takes only 15 seconds in the same situation, thanks to the site's new advanced caching techniques and infrastructure improvements.

Proflowers.com uses two complementary caching services from Akamai and CacheFlow to get these significant improvements. Proflowers.com selected 10 of its most graphics-intensive pages, representing 90 percent of its total traffic, and placed them on the Akamai network.

Kurtz estimates that in its busy month of February, Proflowers.com paid Akamai $2,000 to $3,000. Proflowers.com installed the CacheFlow 3000 network appliance on its site in San Diego to cache all other graphics. Ninety-two percent of the site's content is cached outside the first network switch.

Kurtz's team also scaled the site's infrastructure horizontally by adding about 35 additional machines and servers, and vertically by adding more RAM and CPU power. The team also tuned the local cache of the Microsoft IIS to maximize its performance.

"I know of companies who have better equipment but don't get the performance that we do because they don't know how to tune these boxes," Kurtz says.

Performance has increased anywhere from 200 percent to 400 percent since the team made this combination of improvements, he says.
