It was an online retailer's worst nightmare. Most customers of the Louisville Slugger Museum's online gift shop arrived by way of Hillerich & Bradsby Co.'s home page for its famous bats, www.slugger.com. But last July, someone uploaded the wrong file to the company's Web site, and for six weeks, all attempts to go to the gift shop from the Louisville Slugger home page were redirected to another company's Web site.
The glitch occurred in mid-July. End-of-month reports showed a reduced activity level, but the company attributed that to other causes. Another month passed before it became clear that the online shop was getting no home-page referrals. "When we ran the August reports in September, we realized that www.slugger.com didn't send us any referrals," says Christopher Caudhill, a developer at Hillerich & Bradsby. "We knew we had a problem then."
Caudhill needed something that would alert him to problems faster and eventually decided to use IntegriTea from TeaLeaf Technology Inc. in San Francisco. The software tracks Web site activity and sends an e-mail alert when traffic falls below a set threshold. It also captures and logs end-user sessions, compares them to established norms for response times and other criteria, and reports system errors and user errors such as log-in failures.
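The kind of threshold alert described here can be illustrated with a few lines of Python. This is only a minimal sketch of the idea, not how IntegriTea works internally; the function name find_quiet_referrers, the sample hit counts and the threshold of 10 are all hypothetical.

```python
from collections import Counter

def find_quiet_referrers(referrer_hits, expected_referrers, min_hits):
    """Return the expected referrers whose hit count fell below min_hits.

    referrer_hits: iterable of referrer host names, one per logged visit.
    expected_referrers: hosts that normally send traffic.
    min_hits: hypothetical alert threshold for the reporting period.
    """
    counts = Counter(referrer_hits)  # hosts never seen count as 0
    return sorted(host for host in expected_referrers if counts[host] < min_hits)

# Hypothetical sample: one day of referrer hosts pulled from an access log.
hits = ["search.example.com"] * 40 + ["partner.example.com"] * 3
quiet = find_quiet_referrers(
    hits,
    expected_referrers=["www.slugger.com", "search.example.com"],
    min_hits=10,
)
for host in quiet:
    print(f"ALERT: referrals from {host} are below the threshold")
```

Run daily rather than monthly, a check along these lines would have flagged the missing home-page referrals within a day instead of six weeks.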
IntegriTea is part of a growing family of tools that help Web site managers, systems administrators, application developers, help desks and database administrators design, develop, test, monitor, troubleshoot and tune Web-based systems. And Hillerich & Bradsby isn't the only company that sees a need. According to a study last year by research firm Newport Group Inc. in South Yarmouth, Mass., the number of companies that continuously monitor systems performance rose from just 6 percent in 1999 to 53 percent last year.
Performance monitoring and tuning software originated in the mainframe world decades ago. More recently, tools have sprung up specifically for today's multitier, highly distributed e-commerce systems. The most comprehensive of these monitor performance and information flows at three levels: the Web server front end, the midtier application server and the database back end.
Some also go outside the firewall, using geographically distributed software agents on remote client machines to look through the eyes of end users and report problems with local Internet service providers and third parties such as credit card authorization systems. These can be set up either to simply monitor page-access times or to run elaborate scripts to see, for example, how long it takes to rent a car online.
Users such as Caudhill praise these tools for their ability to quickly identify -- and to some extent diagnose -- problems, from slow responses and system crashes to network outages and user errors. But they caution that the software generally isn't cheap or easy to use and is no substitute for having experienced systems people on staff.
Agents Are Watching
Lastminute.com PLC's Web sites receive some 50 million page views per month. The London-based seller of discount travel and entertainment tickets uses subscription services from Gomez Inc. in Waltham, Mass., to watch the performance of its three-tier architecture in two U.K. data centers. Gomez also monitors the telecommunications services that bring in Lastminute.com users from around the world and the third-party services, such as flight reservation systems, for which Lastminute.com acts as a portal.
Gomez has software agents on PCs in the 12 countries where Lastminute.com does business. At specified intervals -- every 15 minutes, say -- agents at 20 locations poll a half-dozen or so Lastminute.com Web addresses to test for end-to-end response time.
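The basic polling loop behind such an agent can be sketched in Python using only the standard library. This is an illustration under simple assumptions, not Gomez's implementation: it runs on a single machine rather than on distributed agents, and the names timed_fetch, poll and run_forever are hypothetical.

```python
import http.client
import time
import urllib.request

def timed_fetch(url, timeout=10.0):
    """Fetch url once and return (ok, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()  # include download time, not just the headers
        ok = True
    except (OSError, http.client.HTTPException):
        # Covers DNS failures, timeouts, refused connections and HTTP errors.
        ok = False
    return ok, time.monotonic() - start

def poll(urls, fetch=timed_fetch):
    """Probe each URL once and return {url: (ok, elapsed_seconds)}."""
    return {url: fetch(url) for url in urls}

def run_forever(urls, interval_seconds=15 * 60):
    """Poll every interval_seconds (15 minutes here, as in the example above)."""
    while True:
        for url, (ok, elapsed) in poll(urls).items():
            print(f"{url} ok={ok} {elapsed:.2f}s")
        time.sleep(interval_seconds)
```

Calling run_forever with a list of a half-dozen page addresses would approximate one agent's schedule. Passing the fetch function as a parameter keeps the loop testable without touching the network.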
Lastminute.com also uses a "transactional service" from Gomez that automatically executes a script that mirrors an entire user session. For example, it runs through a hotel booking once per hour, says Brendon Cowell, global technical operations manager at Lastminute.com.
"With the reports you get back from Gomez, you can drill down and isolate faults -- at the network level, application level or whatever," he says. "Without something like this, identifying the source of the problem would be very, very tricky."
Every 2.5 minutes, software agents maintained by Keynote Systems Inc. go shopping at REI.com and REI-outlet.com, the online stores for Recreational Equipment Inc. in Kent, Wash. Located around the country, the agents execute a five-page script that includes searching for a product, putting it in a shopping basket, entering address and credit card information and so on.
"If they get two failures in five minutes, it beeps me," says Rod Ketchum, system architect at REI. "And they record that and give us the statistics. That's really hard to duplicate with in-house scripts."
In addition to spotlighting problems, San Mateo, Calif.-based Keynote allows REI to do ad hoc performance testing. "I can put in a URL -- any page at any site -- and it will tell me exactly how long each part of the page takes," Ketchum says. In doing that, Keynote acts as a development tool, allowing REI to test the effects of Web page changes before they're put into production.
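The "two failures in five minutes" rule that Ketchum describes amounts to a sliding-window count. The Python sketch below shows one way such a rule could be written. It is not Keynote's code: the class name FailureWindow is hypothetical, and it assumes probe results arrive in time order with timestamps in seconds.

```python
from collections import deque

class FailureWindow:
    """Signal an alert when max_failures occur within window_seconds."""

    def __init__(self, max_failures=2, window_seconds=300):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = deque()  # timestamps of recent failed probes

    def record(self, timestamp, ok):
        """Record one probe result; return True if an alert should fire."""
        if not ok:
            self._failures.append(timestamp)
        # Drop failures that have aged out of the window.
        while self._failures and timestamp - self._failures[0] > self.window_seconds:
            self._failures.popleft()
        return len(self._failures) >= self.max_failures

# Hypothetical probes every 150 seconds (2.5 minutes).
window = FailureWindow()
for when, ok in [(0, False), (150, True), (300, False), (450, True)]:
    if window.record(when, ok):
        print(f"ALERT at t={when}s: repeated failures")
```

Requiring two failures rather than one filters out isolated blips, at the cost of a slightly slower alert.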
Getting All the Numbers
Ketchum points out that external, agent-based performance monitors may spotlight problems that aren't under the direct control of REI, such as an Internet provider whose service is down or slow. But the performance summaries can be useful when negotiating with Internet providers, he says.
"Having the backup numbers is very handy," Ketchum says. "It's like when you go to buy a car and you have all the prices printed out. You can say, ëYou can't fool us anymore.' "
But the benefits go beyond simple performance summaries. Testing the response time of a Web page is useful, but it's more important to have a monitoring tool that can simulate an entire user session, says Tim Talbot, CIO at PHH Arval. The Hunt Valley, Md.-based subsidiary of Cendant Corp. manages fleets of vehicles for corporate clients and has several Web sites, including one for customers. PHH uses the LoadRunner testing tool and the Topaz performance monitor from Mercury Interactive Corp. PHH worked with Sunnyvale, Calif.-based Mercury to write more than 50 scripts to capture "the user experience," Talbot says.
The performance goals used by these products must be carefully chosen to match the expectations of users as well as processing realities, Talbot says. "It might take one to two minutes to run a report," he explains, "but the same report used to take two to three weeks to receive. It might be going through millions of gas-receipt records, so a two-minute response may be very acceptable."
PHH supplements the Mercury products with more platform-specific tools that can drill down to a finer level of detail. For example, if Topaz says a Windows NT server is slow, PHH uses a product from NetIQ Corp. in San Jose to dig out the details. Talbot says PHH could have integrated all of its performance tools but decided that doing so wouldn't deliver enough value to justify the effort.
Despite the help they give, performance tools "are not simple, and they are not cheap," Talbot says. "You can't just say, 'Oh, a couple of weeks to write scripts and we are done.' You have to involve the business to understand what's truly important to them. We've been evolving this for years.
"There are other tools that are much simpler, and they are priced accordingly," Talbot adds. "But you get what you pay for."
Monitoring on the Cheap
Patrick Killelea, technical director at a major brokerage house and author of the book Web Performance Tuning, 2nd Edition (O'Reilly & Associates, 2002), is lukewarm about most commercial Web system optimization tools, which he says are expensive, lock users into proprietary formats and don't adapt readily to environments that change frequently. Instead, he advises IT shops to roll their own, using freeware tools and languages such as Perl. "You can do almost everything the vendors do with free software," he claims.
Killelea and his staff have used Perl and Gnuplot, a free plotting program, to create monitors for the brokerage's online trading systems.
Killelea says companies that want to take this route will need 12 free programs to create their Web performance monitors, including Perl (www.perl.com) and the GNU Compiler Collection (www.gnu.org/software/gcc/gcc.html). He offers download links for Gnuplot and nine other programs needed to monitor Web pages with Perl at http://patrick.net/software.
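A homegrown monitor of this kind typically boils down to a script that appends timing samples to a plain-text file that Gnuplot can read. The sketch below shows that logging step in Python rather than Perl. It is not Killelea's code; the function names format_sample and append_sample and the file name response.dat are hypothetical.

```python
def format_sample(epoch_seconds, response_seconds):
    """Format one measurement as a whitespace-separated line.

    Column 1 is the time of the probe in seconds since the epoch,
    column 2 is the measured response time in seconds.
    """
    return f"{int(epoch_seconds)} {response_seconds:.3f}\n"

def append_sample(path, epoch_seconds, response_seconds):
    """Append one measurement to the data file at path."""
    with open(path, "a", encoding="ascii") as log:
        log.write(format_sample(epoch_seconds, response_seconds))

# Example of the line a monitoring script would write after each probe.
print(format_sample(1000000000, 0.25), end="")
```

With samples accumulated in response.dat, a Gnuplot command such as plot "response.dat" using 1:2 with lines draws response time against time.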
Commercial tools are easier to use initially, Killelea concedes. "Open source is harder to get started with because you have to know more, but in the long run you have far more power because you can look at the source (code) and know exactly what's going on in the tool and change it to do what you want," he says.
"There is immaturity in the tools, and they can be difficult to work with," agrees Corey Ferengul, an analyst at Meta Group Inc. in Cambridge, Mass. But he cautions against creating homegrown management tools, which he says require an internal programming group to develop and support them. "These are not resources most companies have," he says.