Kevin Book, senior director of technology at The Motley Fool, can easily recall the days before he had adequate tools to monitor his company's popular financial Web site, which attracts more than 2 million visitors per month. "Our online store actually ran out of disk [space]," he says. "We were unable to write orders. You can imagine what the guys in ties were thinking when that happened."
Like many IT managers today, Book is charged with keeping a Web infrastructure up and humming. He relies on an assortment of products for snapshots that, taken together, form a more or less complete picture. Some, like the Tivoli Enterprise Console from Tivoli Systems, provide trouble alerts and event logs. Others, such as Patrol 2000 from BMC Software, monitor major elements such as servers, applications and databases. Additional tools, including SiteScope from Freshwater Software and the Web-based performance services from Keynote Systems, specifically test page downloads and other Web transactions and compare them to industry benchmarks.
"We're running better, we're running faster, and we're running bigger with the same number of staff," Book says. Still, he adds, "it's difficult to get down to one product." Some tools, such as Tivoli and Patrol, share data, but "it would be nice to have a truly unified view," Book says.
Jeb Bolding, a senior analyst at Enterprise Management Associates (EMA), agrees with that assessment. "It's still difficult to get a complete, end-to-end view," he says. "They all say they've got one, and no one really does."
Book's multitiered, piecemeal approach is common. Mainstream systems management frameworks such as Tivoli, HP OpenView from Hewlett-Packard and Unicenter TNG from Computer Associates often serve as both underpinning and umbrella, allowing network managers to keep an eye on all their hardware and applications, of which their Web systems are a subset.
Some IT managers also employ bandwidth-management and traffic-shaping software from companies such as Resonate, as well as "intelligent" network hardware to avoid bottlenecks and redirect resources. For example, Motley Fool has a Big-IP Controller load balancer from Seattle-based F5 Networks that can use input from SiteScope to balance loads among Web servers.
Web-tailored point solutions complement the broader management frameworks, often sharing data with them, says Bolding, who has interviewed companies about their Web infrastructure strategies. "I have yet to run into any enterprise that has a framework product that doesn't [also] have a point product that fills in the hole."
These Web-specific utilities have taken an increasingly application-centric view in recent years. Vendors of automated testing tools for software developers, such as Mercury Interactive and Segue Software, have repositioned their products to take advantage of the e-business market buzz. Application monitoring reached a still finer level of granularity on April 30, when Dirig Software announced Fenway, which it claims is the first component-level management software for application servers. Fenway can purportedly detect failures in Java- and Microsoft-based software objects and components.
Dirig is among a handful of vendors that have also recently claimed to provide the first comprehensive, application-centric views of e-business infrastructures. Altaworks makes that claim for its new Panorama, also announced April 30.
Taking the holistic systems approach to still another level is adaptive control, a technique similar to an airplane's autopilot that is already used in power plants and digital cameras, according to EMA. In October, it became available in Peakstone's eAssurance product, which measures site activity in real time, compares it against a preset model of service quality and automatically makes necessary adjustments to Web caches, load balancers, servers and databases.
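The measure, compare and adjust cycle described above can be sketched in a few lines. This is only an illustration of the general feedback idea, not Peakstone's method: the `ServiceModel` fields, the one-server step size and the choice of server count as the thing being adjusted are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ServiceModel:
    """Hypothetical preset model of service quality."""
    target_ms: float     # desired response time
    tolerance_ms: float  # dead band around the target where nothing changes
    min_servers: int
    max_servers: int


def adjust_pool(model: ServiceModel, measured_ms: float, active_servers: int) -> int:
    """Return the new number of active servers after one control step."""
    error = measured_ms - model.target_ms
    if abs(error) <= model.tolerance_ms:
        return active_servers  # within tolerance: leave the pool alone
    step = 1 if error > 0 else -1  # too slow: add a server; faster than needed: remove one
    return max(model.min_servers, min(model.max_servers, active_servers + step))


def run_control_loop(model, measurements, active_servers):
    """Feed a sequence of measurements through the controller and record each decision."""
    history = []
    for measured_ms in measurements:
        active_servers = adjust_pool(model, measured_ms, active_servers)
        history.append(active_servers)
    return history
```

The dead band keeps the controller from reacting to every small fluctuation, and the minimum and maximum bounds stop it from running away in either direction.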
The overarching trend seems to be to build an increasingly detailed data portrait of each of the infrastructure's stress points, then analyze and present it so that IT can get better at provisioning sites. For example, many products perform root-cause analysis, which attempts to statistically correlate events with other conditions that occurred at the same time, providing better clues to the nature and location of problems. But Bolding says the method's success depends heavily on product-specific knowledge bases that haven't yet been adequately developed or detailed.
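As a rough illustration of the statistical correlation step, the sketch below ranks candidate metrics by how closely they track a symptom such as response time over the same sampling intervals. It uses plain Pearson correlation, which is an assumption for the example rather than any vendor's algorithm, and the metric names are hypothetical. A high score is only a clue about where to look, which is why the product-specific knowledge Bolding mentions still matters.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_suspects(symptom, metrics):
    """Sort metrics by the strength of their correlation with the symptom series."""
    scores = {name: pearson(symptom, series) for name, series in metrics.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical samples taken at the same five intervals.
response_ms = [100, 120, 400, 110, 420]
candidates = {
    "cpu_percent": [50, 50, 50, 50, 50],
    "db_connections": [10, 12, 40, 11, 42],
}
ranked = rank_suspects(response_ms, candidates)
```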
Other IT managers say they're also using a mix of targeted tools and comprehensive system-management suites.
At Thomson & Thomson, a trademark and copyright services firm, the focus is definitely on application testing. A division of The Thomson Corp., a $2.6 billion publisher in Toronto, T&T in 1997 upgraded its main product, Trademarkscan, which lets users search 16 databases of U.S. patent and trademark filings. T&T put a Web-based graphical user interface (GUI) in front of its awkward Dialog command-line screens and began hosting the databases locally, reducing the number of front-end servers from five to two. The result was the Saegis service. "We've rearchitected the system several times to make it more efficient," says Brian Chase, quality assurance manager at T&T.
Chase and a small support team use Segue's Silk Test and Silk Performer to test new builds of Saegis and monitor its performance after deployment. Silk Test, which compares the values of each page's HTML code against known values, comes in handy for checking the GUI and the accuracy of data loads after a build. The alternative would be writing 500 to 600 short test scripts, Chase says.
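The idea of checking a page against known values can be sketched with Python's standard `html.parser` module. This is not how Silk Test works internally; the fields collected (the page title and named `input` elements), the `input:` key prefix and the sample page are assumptions chosen to keep the example small.

```python
from html.parser import HTMLParser


class FieldCollector(HTMLParser):
    """Collect the page title and the values of named <input> elements."""

    def __init__(self):
        super().__init__()
        self.values = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "input" and attributes.get("name"):
            self.values["input:" + attributes["name"]] = attributes.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.values["title"] = self.values.get("title", "") + data


def diff_against_baseline(html, baseline):
    """Return {key: (expected, actual)} for every baseline value the page fails to match."""
    collector = FieldCollector()
    collector.feed(html)
    collector.close()
    return {
        key: (expected, collector.values.get(key))
        for key, expected in baseline.items()
        if collector.values.get(key) != expected
    }


# Hypothetical page from a new build.
page = (
    "<html><head><title>Saegis Search</title></head>"
    '<body><input name="q" value=""><input name="db" value="US"></body></html>'
)
```

An empty result means the build matches the baseline; anything else names the field that changed, which is the kind of check that would otherwise take one short script per page.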
Silk Performer logs into the live site every 10 minutes. It's programmed to aggregate the search and billing steps of a simulated "metauser," says Chase. "We use it to determine thread depth and the latency of the way the searches are progressing through the system," he explains. "If the transaction doesn't come back in a certain amount of time, we know we have a lag or a latency." The software spots memory leaks and bad threads and helps staff tweak applications for better performance.
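The timing rule Chase describes, in which a transaction that does not return within a set time is flagged as lag, can be sketched as a small probe. The function name, the status labels and the injectable clock are assumptions for illustration; the transaction itself is passed in as a callable so the sketch does not presume how a real login, search or billing step would be scripted.

```python
import time


def timed_probe(transaction, threshold_s, clock=time.monotonic):
    """Run one simulated transaction and classify it as ok, slow or failed."""
    start = clock()
    try:
        transaction()
    except Exception as exc:
        # The transaction itself broke; report that separately from slowness.
        return {"status": "failed", "elapsed_s": clock() - start, "error": repr(exc)}
    elapsed = clock() - start
    status = "slow" if elapsed > threshold_s else "ok"
    return {"status": status, "elapsed_s": elapsed, "error": None}
```

A scheduler would call the probe at a fixed interval, every 10 minutes in T&T's case, and alert staff on any result other than "ok".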
At Acuson the need for new management tools was driven by last November's effort to upgrade the company's "brochureware" Web site so it could support e-commerce, says Rob Shearin, CIO and vice president of IT at the maker of medical ultrasound equipment. A merger with Siemens AG also required tying into the Germany-based electronics company's intranet, says Shearin. "Scalability is important to us," he says of the site, which runs on two Unix boxes hosted by an outside provider. The intranet, which runs internally on Windows NT hardware, started with 2,500 users at Acuson but now is linked to 26,000 others in the medical group and 440,000 Siemens employees worldwide.
Availability, however, wasn't Shearin's main concern during the process of selecting management products last year. Because Acuson's high-ticket products have such long sales cycles, it would be helpful to "separate the buyers from the browsers" to pass along information to salespeople. "I need something that can help me segment and monitor and prioritize and understand my capacity," he says.
Acuson uses Freshwater's SiteScope to view a basic topology of Web servers and traffic-analysis software from WebTrends to analyze visitor habits. But Shearin's staff still needs standard network-management tools to keep an eye on the hardware layer. "We have the [Simple Network Management Protocol] hook into the environment to make sure the boxes are up and available," Shearin says. Acuson does use HP OpenView software, which provides alerts but doesn't provide the real-time performance snapshots that Shearin wants. For that, he uses Peakstone's eAssurance, which underwent proof-of-concept testing in mid-May.
Tools Lack Breadth, Timeliness
Shearin says initial tests of eAssurance showed that the Peakstone product provides not only customer activity reports but also early warnings about capacity problems. "From an investment standpoint, it's been able to tell us, 'have we overbought?' " he says. His staff also likes the product's centralized, comprehensive view. But it's too early to judge the software's ability to control Acuson's infrastructure. Shearin says he has a wish list of improvements for Peakstone, though he won't identify them. "There are certainly a few bumps in the road," he says, especially in coordinating the new management responsibilities with the hosting provider.
Chase says T&T's Segue combination has proved effective. "We've had almost no downtime in the last year or two," he says. He recounts an incident when the T&T team noticed one server running slowly. "We realized we had messed up a package installation, but we were able to catch it pretty quickly" with Silk Performer, he says. Chase says he likes the repeatability of the automated tests and the fact that they relieve his staff of tedium. But like some other users, he faults the management tools for not keeping up with upgrades of the software they manage. "It takes a while sometimes for [Segue] to get their newest versions done," says Chase. "It got kind of tough to wait for. In the end, what they gave us was really good."
Though he also uses diagnostic tools that come with his Sun servers, Chase acknowledges that T&T's infrastructure management strategy is incomplete and says company network managers are investigating additional management software, perhaps Tivoli. They already use the AppWorx Enterprise Scheduler from AppWorx to monitor uptime of the Saegis servers, which run at hosting provider Digital Island's center.
Why not outsource all the management headaches? Many users are harshly critical of the management tools and support provided by managed service providers (MSPs), calling them inadequate and bemoaning the managerial confusion from renegotiating service levels and responsibilities. "At the end of the day, they can't play in this space," says Shearin. "Managed services are not a good value for the dollar," agrees Book. "We have experienced no visible gains." Bolding says MSPs especially appeal to large companies with numerous divisions that can't support their own network management staff, but the users he has talked to paint a similarly unflattering picture. "Typically . . . they've been disappointed with it," he says.
Fenway - starts at US$15,000 to $30,000.
Dirig Software, www.dirig.com.
HP OpenView - starts at US$23,900 for Operations console, $230 per node.
Patrol - separate Predict and Perform versions for Oracle (US$290 and $390 per server, respectively) and Unix ($395 and $875); Storage Resource Manager (starts at $40,000); Service Level Management (starts at US$5,000 plus $195 per managed node; Windows versions start at $815).
Site Angel - starts at US$900 per year.
BMC Software, www.bmc.com.
Peakstone eAssurance - US$48,000 plus $4,800 annually per Web server CPU.
Silk Performer - starts at US$25,000.
Silk Test - starts at US$6,500.
Segue Software, www.segue.com.
SiteScope - US$995 for 25 monitors.
Freshwater Software, www.freshwatersoftware.com.
Tivoli Enterprise Console - approximately US$300 per node.
Tivoli Systems, www.tivoli.com.
Unicenter TNG - starts at US$2,500.
Computer Associates, www.ca.com.