Shop.com tries to diagnose Web site problems

How Shop.com holds up under the strain of the holiday shopping season

With the holiday shopping season in full swing, Shop.com wants its Web site running as smoothly and quickly as possible. An hour of slow service could cost the Californian company hundreds of thousands of dollars, company officials say.

The company uses four or five monitoring systems to detect problems that could slow down the Web site and drive customers away. Shop.com executives say one in particular helps them avoid the clutter of false warnings, diagnose problems before they affect service to customers, and operate an IT department with a small staff.

The system, developed by ProactiveNet, collects data to learn what the normal behaviour of the company's IT infrastructure is at any given time of the day or week. This allows it to discover when CPU usage is abnormally high or low.

Most programs "tell you if your server is up or down," says Geoff Caras, vice president of system infrastructure at Shop.com. "ProactiveNet might say your system's up but your system is being used very heavily ... It's a little more intelligent about how it alerts and what it tells you."

A few companies make programs that work the same way as ProactiveNet's, but most on the market are less sophisticated, Caras says.

A typical program might issue an alert when CPU usage goes above a certain threshold, say 85 percent.

This leads to irrelevant warnings, because during busy shopping times a high usage rate may be normal, says David Langlais, vice president of marketing for ProactiveNet. At the same time, a system that reacts only to an 85 percent threshold could miss problems that happen during slow shopping times.

"If (CPU usage) is normally at 20 percent at 6 in the morning and it's at 70 percent at 6 in the morning, you're already behind the curve," Langlais says.
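The difference between the two approaches can be illustrated with a short Python sketch. It is not ProactiveNet's actual algorithm, which the article does not describe; the function names, the sample readings, and the three-standard-deviation band are all assumptions chosen for illustration. Only the 85 percent cutoff and the 20-versus-70 percent example at 6 in the morning come from the text above.

```python
from collections import defaultdict
from statistics import mean, pstdev

STATIC_THRESHOLD = 85.0  # fixed cutoff from the article's example


def build_baseline(samples):
    """Learn (mean, std dev) of CPU usage for each hour of the day.

    samples: iterable of (hour_of_day, cpu_percent) pairs (hypothetical data).
    """
    by_hour = defaultdict(list)
    for hour, cpu in samples:
        by_hour[hour].append(cpu)
    return {h: (mean(v), pstdev(v)) for h, v in by_hour.items()}


def static_alert(cpu):
    """Typical program: alert only above a fixed threshold."""
    return cpu > STATIC_THRESHOLD


def baseline_alert(baseline, hour, cpu, k=3.0, min_band=5.0):
    """Alert when usage strays from what is normal for this hour.

    k and min_band are illustrative tuning values, not vendor settings.
    """
    mu, sigma = baseline[hour]
    band = max(k * sigma, min_band)  # floor keeps the band from collapsing
    return abs(cpu - mu) > band


# Made-up history: quiet at 6 a.m., busy at 8 p.m.
history = [(6, c) for c in (18, 20, 22, 19, 21)] + \
          [(20, c) for c in (86, 90, 88, 92, 89)]
baseline = build_baseline(history)

# 70 percent at 6 a.m.: the fixed threshold stays silent, the baseline flags it.
print(static_alert(70), baseline_alert(baseline, 6, 70))    # False True
# 90 percent at 8 p.m.: the fixed threshold fires, the baseline sees a normal peak.
print(static_alert(90), baseline_alert(baseline, 20, 90))   # True False
```

In this toy data the fixed threshold produces one missed problem and one irrelevant warning, which are the two failure modes Langlais describes.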

Shop.com says it processes nearly 500 million transactions a year, connecting customers with more than 1,600 merchants.

Shop.com has used ProactiveNet for about two and a half years, and a couple of months ago it wrote new procedures that identify blockages caused by multiple requests for the same data. These requests can come from customer orders or from anything else happening on the site.

Just as a fallen tree can block a stream, multiple requests for the same data can back up servers, slow down a Web site and drive customers away.

"Anything that blocks in the database is a really big deal," Caras says. "In the absolutely worst case, if we had serious issues it can take the entire site down."

Because slowness had been a problem, Shop.com implemented the new process in time for this year's holiday shopping season.

"We did get alerts on this before, but we got them when things were pretty blocked up, much farther upstream. It had a bigger impact," he says.

Besides helping to solve technical problems, ProactiveNet identifies areas of Shop.com's Web site that are drawing lots of customers and should be bolstered with an extra server to expand network bandwidth, Caras says.

The ability of ProactiveNet to collect data about the proper behaviour of Shop.com's computer systems and diagnose problems early has allowed the company to keep its IT staff relatively small, Caras says.

"We don't have 20 people in IT, we have eight. When you look at a large site, there's typically a whole lot more people managing IT," he says.

Shop.com uses several monitoring systems in addition to ProactiveNet, but they send out so many alerts it is hard to tell which are due to real problems.

"They give us thousands of alerts per day," Caras says. "Climbing through that is a significant task."
