Search Engines Can Take On Complex Functions

FRAMINGHAM (07/03/2000) - How many search engines do you use? If you're like Paul Gellman, MIS director at Markon Pen and Pencil Inc., a Mineola, New York-based advertising specialty supplier, you use a few to find the inventory you want to buy, a few more to find where to unload excess inventory and a whole lot more for just surfing the Web and finding that perfect low-priced airline fare.

To a seasoned Web surfer, using different search engines for different tasks is easy. "Once you become experienced with searching, the different models are pretty easy to understand," says Gellman.

But try flipping the paradigm - choosing just one search engine to install on your site for customers - and things get a lot harder to understand. Choose carefully: Finding the right search engine to let customers search your online store or to allow employees to unearth best practices on your intranet can mean life or death for revenue. Usability researcher Jakob Nielsen, a principal of Silicon Valley's Nielsen Norman Group, reports that more than half of all users are "search dominant": When faced with a choice of using a search box or clicking a link, they use the search box.

The days when sites could rely upon search engines that matched the user's query with every possible response are long gone. Today, users expect not only fast search engines, but also well-ordered results with a lot of context.

The fate of users who trust a corporate or e-commerce site's search technology rests squarely on the shoulders of the information technology department and the Web site's developers - and the technology they choose. There are three options for getting search technology: buying it, building it in-house or outsourcing it. Your choice will depend upon whether the engine is for a business-to-consumer e-commerce site or a corporate intranet.

Site Search for Consumers

Search engines on the Web might seem simple: users input a query, the search engine dissects the keywords and checks its databases - mirror copies of every Web page it has ever found - and returns links to the most likely Web pages.
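The keyword lookup described above can be sketched in a few lines. This is a minimal illustration of an inverted index, not the method of any engine named in this article; the page URLs, the page text and the function names are all hypothetical.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every keyword in the query, sorted by URL."""
    words = query.lower().split()
    if not words:
        return []
    # Intersect the URL sets for each keyword; unknown words match nothing.
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


# Hypothetical pages standing in for the engine's stored copies.
pages = {
    "example.com/pens": "ballpoint pens and pencils",
    "example.com/pda": "handheld organizer accessories",
    "example.com/gifts": "engraved pens and handheld gifts",
}
index = build_index(pages)
print(search(index, "handheld pens"))
```

A real engine would also rank the matches rather than sort them by URL, which is where the products discussed below differ most.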

But according to a recent NEC Research Institute and Inktomi Corp. research study, there are more than 1 billion indexable pages on the Web today, while current search engines index only about half of them.

Still, that's 500 million Web pages. So using that same search technology to index and retrieve a few thousand products should be easy - right?

Unfortunately, searching a site is a different ball game than searching for Web pages. When shopping for products, people aren't willing to churn through dozens of pages of search results. They want accurate, well-presented results, and they want them fast.

And as you add more products, you might need to add more search technology, too. That's what Marc Raygoza, director of Web development at Buy.com Inc. in Aliso Viejo, Calif., found: As Buy.com added more kinds of products to its inventory, it also needed to add new search technology to be able to accurately and quickly index them. Buy.com, which opened in November 1997 to sell computers and software, doesn't sell just a few thousand products anymore. It has more than 1.5 million products in an array of categories such as books, videos and software games.

Buy.com was changing the prices on many products and adding inventory daily, and it needed a search engine that wouldn't buckle under the weight of rebuilding its index every night. The company had based its search function on Microsoft SQL Server Version 6.5, which couldn't perform full-text searching or rebuild the index quickly enough.

"SQL Server is a good product, but it just doesn't cut it for millions of SKUs," says Raygoza. "We needed something more robust to do our updating."

So Raygoza went looking for a new search engine, evaluating products such as AltaVista Co.'s Search Engine 3.0 SDK and an array of products and services from Inktomi.

"We took Inktomi out of the running fairly early because their model is basically outsourcing, and we didn't want to outsource the search function completely," says Raygoza.

Buy.com chose AltaVista because its software provided much faster updating than SQL Server and the company had a strong reputation and "fit well into our model of commodity selling," which is based on high turnover of merchandise, says Raygoza.

Update Speed Is Crucial

The key, says Raygoza, is how quickly the search engine rebuilds the search index each night. Buy.com also upgraded to SQL 7 and still uses it for log-in functions. Before it made its final decision, Buy.com tested the updating capacity of the AltaVista software and found it took just 42 minutes to rebuild the entire index, compared with two days for SQL Server, according to Raygoza.

What's more, he adds, the AltaVista engine was already being used by competitors Amazon.com Inc. and Borders Group Inc.'s Borders.com site.

But don't confuse the e-commerce version of AltaVista with its free Web page searching sibling. The core search technology is the same, having been developed on the main site, says Rajiv Parikh, AltaVista's director of product marketing for search and business solutions. But for corporate customers such as Buy.com, AltaVista Search Engine 3.0 adds ranking and relevancy functions that are customized for the business and can be modified as business or market conditions change. AltaVista's corporate search tools also allow users to search other information sources besides the Web, such as their inventory databases, says Parikh.

Staples Inc., meanwhile, was looking for maximum flexibility for the user to go along with search features that could be customized to business needs.

Tom House, project manager of the newly revamped Staples.com Web site, recently put the browser engine idea into practice.

Unlike search engine technology, in which programs look for keywords in Web documents or files, much as a telephone operator uses a name and address to find a number, browsers give the user free rein to jump from hyperlink to hyperlink on the Web.

"With over 130,000 different office supplies sold on the site, we had to offer more than just a static list; users needed to be able to drill down into multiple categories," says House. "The time had come for search to meet browser."

The core of the Staples.com search engine is based on Microsoft Site Server search engine technology - in which software is used to build indexes that can support linguistic searches and proximity searches. This is particularly helpful when users don't know the exact names of the products they're looking for. Late last year, an in-house Staples team began building out on that technology - offering site improvements that include a more intuitive navigational framework.
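A proximity search of the kind mentioned above matches two terms only when they appear close together in the text. The sketch below shows the idea in its simplest form; it is an assumption-laden illustration, not how Microsoft Site Server implements the feature, and the function name and sample product text are hypothetical.

```python
def near(text, first, second, max_distance):
    """Return True if both terms occur within max_distance words of each other."""
    words = text.lower().split()
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    # Compare every occurrence of one term against every occurrence of the other.
    return any(
        abs(i - j) <= max_distance
        for i in first_positions
        for j in second_positions
    )


# Hypothetical product description.
description = "black ballpoint stick pens box of twelve"
print(near(description, "ballpoint", "pens", 2))
print(near(description, "black", "twelve", 2))
```

The first call matches because the two words sit two positions apart; the second does not, since six words separate the terms.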

Customers visiting the Staples.com site can search by keyword, item number or brand. When the words palm pilot are entered, the site retrieves postings for ballpoint pens and personal digital assistants - but "click-on" headings such as Organizer & Handheld Accessories and Ball Point Stick Pens placed to the right of the product postings link users to appropriate categories.

Intranet Search Lags

Customer goes to site, searches for product, finds it and makes purchase.

That's the e-commerce buying cycle. But when it comes to corporate intranets, things get more complicated. For one, most companies haven't invested in decent intranet technology. "Two-thirds of corporate intranets currently aren't searchable in one shot," says Hadley Reynolds, director of research at the Boston office of Delphi Group Ltd. Another problem: Assessing what users need from their intranet isn't as easy as making sure every consumer buys three things.

The promise of corporate intranets is simple: a place for employees to do things such as review best practices gleaned from past projects, make adjustments to their 401(k) plans or check the company stock price or the weather. But the reality is often very distant from that vision. "How do you define the term corporate intranet? In the past, engineers used to joke that it was just a Web site like any other - but with a very tight budget," says Ian Hersey, vice president of linguistics products at Inxight Software Inc., a knowledge extraction software company in Palo Alto, Calif.

The mandate now for intranet designers, says Hersey, is to enable people to find content beyond the part numbers, prices and registration numbers they need to complete sales records.

At the New York law firm Skadden, Arps, Slate, Meagher & Flom LLP, one of the lures of a corporate intranet was that it would allow the firm to search past cases it had handled. The firm needed to find software that could handle large documents, was scalable and was flexible enough to adapt to future needs and be compatible with the technology already in use, according to Charmaine Polvara, knowledge manager at the firm.

Skadden, Arps set up its intranet using the Verity software that came in a package with the ColdFusion intranet server it had selected, says David Hill, systems development manager at Skadden, Arps.

The law firm then chose Verity Inc.'s Information Server, Knowledge Organizer and HTML Export as the knowledge retrieval software underlying its Web-based legal precedents system. This system gives the 1,400 attorneys at Skadden, Arps access to filings, briefings and other documents related to previous case work, says Polvara.

One of the key attractions of the Verity products was their interoperability with the firm's document management system, Hummingbird's PCDocs, says Hill.

The system also provides hit-term highlighting in both text documents and Portable Document Format files, a crucial function for lawyers searching for very specific information in briefs that run to many pages, says Hill.

"The system now allows any attorney to browse to any document stored in our databases in any form and then search the document efficiently," says Hill.

Business-to-consumer and intranet search technology, however, isn't always perfect. Accordingly, many companies are providing backups in the event of search failure. "Let's say I'm ordering a printer part at Buy.com," says Gellman at Markon Pen and Pencil. "If it doesn't work, I can call the 800 number, talk to a person and order the stuff." That, he says, is reassuring - and one of the reasons he keeps going back.

Operators Standing By

Raygoza says Buy.com is supported by an outsourced 400-operator help desk, and the company is working to shift functions from customer service representatives to the Web site.

"It used to be that anything like a request of cancellation involved a phone call, but now we direct people to a place they can do it on the site, and our calls have decreased dramatically," says Raygoza.

"No matter how good and how user-friendly your search technology and your site are, people are still apprehensive about placing orders on the Web. We can take them to the product with the technology, but it's hard for them to put their trust and their money in a software application. It's changing, but a lot of people still feel better with a human guiding them through difficulties."

John is a freelance writer in Menlo Park, Calif. Staff reporter Mathew Schwartz contributed to this article.

Prominent Players

Over the past three years, a number of companies have developed or expanded their Internet search offerings to corporate users. Here are some of the prominent or promising players:

Ask Jeeves

Ask Jeeves Inc.'s Ask.com Web site provides Internet links to questions posed to the search engine in the form of standard, conversational sentences.

Once a question is posed, the company's proprietary artificial intelligence software breaks down the syntax and meaning of the query and compares the information with other questions in its database. Jeeves then returns the answer, along with results from other search engines.

Corporate Web sites powered by Ask Jeeves - such as those of Microsoft, Nike Inc. and Micron Electronics Inc. - license the "natural language" technology that enables online customers to make inquiries the same way they would in person or over the telephone. In addition to using artificial intelligence technology, Ask Jeeves employs a team of editors and librarians who classify queries and answers. The teams assist in customizing natural-language query technology for companies, enabling them to enter information about their products into the knowledge base.

Searchers are also able to talk to human beings through the Jeeves Live Version 4.0 programs, which enable Web-based customer interaction through real-time, voice- and text-based collaboration between customers and live agents.

Launched in 1997, Ask Jeeves is based in Emeryville, California.

Inktomi

Started in 1996 by a computer science professor and his first graduate student at the University of California at Berkeley, Inktomi is a search engine that focuses exclusively on providing search technology to other sites.

As of May, software produced by Foster City, California-based Inktomi powered more than 80 sites on six continents, performing more than 80 million searches per day, according to company officials.

The Inktomi Search Engine is now used by many of the world's leading search sites, including HotBot, NBC's Snap and Yahoo. In what may be an indication of the future for search engine companies, however, Inktomi and Yahoo Inc. announced last week that they had formed a partnership that will make Inktomi's technology the basis for the Yahoo corporate intranet. At the same time, Santa Clara, Calif.-based Yahoo will phase the Inktomi search engine out of its public search site.

Google

As Inktomi retires from the Yahoo public site, Google Inc. will move in as the default search provider for Yahoo, according to an announcement by the companies last week. Google claims that its most recently released index, comprising more than 1 billion Web addresses, makes it the largest search engine available today.

Two Ph.D. candidates at Stanford University developed a search engine that uses mathematical algorithms to determine the importance and relevancy of Web pages and then founded Mountain View, California-based Google Inc. in 1998.

Whereas most search engines use a keyword or metasearch technology, Google is hypertext-based - a feature that helps ensure that the most important result comes up first. Google also provides searchers with an excerpt from the Web page with the search terms highlighted in bold type.
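One way to picture ranking by hyperlinks rather than keywords alone is to let each page pass a share of its own importance to the pages it links to, and repeat until the scores settle. The sketch below is a simplified illustration of that general idea under assumed parameters; it is not Google's production algorithm, and the three-page link graph, the damping value and the function name are hypothetical.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages by repeatedly passing each page's score along its outgoing links."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline score regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: pages "a" and "b" both link to "c"; "c" links back to "a".
links = {"a": ["c"], "b": ["c"], "c": ["a"]}
scores = link_rank(links)
print(sorted(scores, key=scores.get, reverse=True))
```

In this toy graph, page "c" ends up with the highest score because two pages point to it, while "b", which nothing links to, ends up last.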

Xdex

One newly released search engine system for the corporate intranet market is Columbia, Md.-based Sequoia Software Corp.'s Xdex - a shrink-wrapped XML indexing engine that can catalog XML-based business documents, even when users don't know where the document is stored.

Originally part of a health care software system created to store and index patient information, Xdex uses the hierarchical structure of XML to search for tag and value pairs, enabling users to conduct searches within content context.
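Searching on tag and value pairs means a query can ask for a value in a particular field rather than anywhere in a document. The sketch below illustrates the idea with Python's standard XML parser; it is not Xdex's implementation, and the record layout, names and function are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical records; "Smith" appears as a physician in one and a patient name in the other.
DOCUMENT = """
<records>
  <patient id="p1"><name>Lee</name><physician>Smith</physician></patient>
  <patient id="p2"><name>Smith</name><physician>Jones</physician></patient>
</records>
"""


def search_tag_value(xml_text, tag, value):
    """Return the id of each top-level record containing the given tag with the given text."""
    root = ET.fromstring(xml_text)
    hits = []
    for record in root:
        for element in record.iter(tag):
            if (element.text or "").strip() == value:
                hits.append(record.get("id"))
                break  # one match per record is enough
    return hits


print(search_tag_value(DOCUMENT, "physician", "Smith"))
print(search_tag_value(DOCUMENT, "name", "Smith"))
```

A plain keyword search for "Smith" would return both records; restricting the match to a tag separates the physician from the patient.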

- Lauren John
