The future of enterprise search

Future search systems will get to know the user as much as the content they crawl, analyze and index

A startup called Powerset gained a slew of headlines last week when it launched a beta version of its search engine, which, like other offerings, employs natural language processing, allowing users to search sets of information by posing their queries as questions.

But the future of search, particularly within enterprises, will go well beyond processing queries or parsing content. Future search systems will get to know the user -- and communities of users -- as much as the content they crawl, analyze and index, observers say.

"Relevance is in the eye of the beholder -- what's relevant for me may not be relevant for you. Consequently, what's needed is a profile of the user (interests, vocabulary, previous searches, job title, etc.) and a profile of the content (author, subject, date, who's read it, etc.) Great search matches the two up," said Guy Creese, an analyst with Burton Group, via e-mail.

"To do that, these profiles need to be equally sophisticated. Enterprise search vendors for a long time have spent a lot of effort on profiling content, but not profiling users. This will change over time, as systems such as Amazon.com make it clear that knowing a lot about the user makes it easier to find and suggest relevant content."

For example, Creese said, if a user was a network engineer and entered "ATM" as a query, a smart search system could rank results for "asynchronous transfer mode" more highly than "automated teller machine."
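
Creese's "ATM" example can be sketched in a few lines of Python. This is only an illustration of the general idea of matching a user profile against a content profile, not how any vendor named in this story actually does it. The class names, the subject tags, the scores and the additive boost are all assumptions made up for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    # Hypothetical user profile: interest terms mapped to a weight that
    # reflects how strongly each term describes the user.
    interests: dict[str, float] = field(default_factory=dict)


@dataclass
class Document:
    title: str
    subjects: set[str]   # hypothetical content-profile tags
    base_score: float    # assumed score from plain keyword matching


def personalized_score(doc: Document, user: UserProfile, boost: float = 1.0) -> float:
    """Add the user's interest weight for each subject tag on the document."""
    overlap = sum(user.interests.get(tag, 0.0) for tag in doc.subjects)
    return doc.base_score + boost * overlap


def rank(docs: list[Document], user: UserProfile) -> list[Document]:
    """Order documents from highest to lowest personalized score."""
    return sorted(docs, key=lambda d: personalized_score(d, user), reverse=True)


# Two made-up results for the query "ATM".
results = [
    Document("Automated teller machine locations", {"banking"}, base_score=1.0),
    Document("Asynchronous transfer mode overview", {"networking"}, base_score=0.9),
]

# A made-up profile for a network engineer.
engineer = UserProfile(interests={"networking": 0.8, "routing": 0.5})
ranked_titles = [d.title for d in rank(results, engineer)]
```

With the engineer's profile, the networking document moves to the top even though its keyword score is lower. With an empty profile, the order falls back to the keyword scores alone.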

While many companies have a role to play and products that work, Google is the company to watch in the long term if you want to know where enterprise search is headed, according to analyst Stephen Arnold.

"When you hear the big companies saying, we are doing an enterprise solution and Google isn't a problem, you have to ask yourself, are these guys connected to reality?" he said during a recent speech at the Infonortics Search Engine Meeting in Boston. "Buying into the Wall Street crowd's [contention] that this is an advertising company is crazy."

In the meantime, the search market has fragmented into a few distinct size classes, analysts say: offerings from major vendors like IBM, Oracle and, with its recent acquisition of FAST Search & Transfer, Microsoft; larger independents such as Autonomy; and smaller, specialized vendors.

Arnold recently wrote a nearly 300-page study for Gilbane Group, "Beyond Search," that takes a deep dive into the facets of the enterprise search market. Although search-focused companies fall into only a handful of size categories, they vary widely in their technological focus. These are among the sub-segments Arnold identified:

Database-centric systems, such as Teratext and Intelligenx. "Because of this, these systems are adept at handling data management, content repurposing, and generating reports from the content that reside in the system's database," he wrote.

Companies involved in "deep analysis" of content, which include Attensity and Siderean Software. "The use of multiple processes in iterative cascades point to the direction search and content processing is moving. Simple key word indexing is a Model-T Ford to these vendors' finely tuned machines."

"Tools" companies like SchemaLogic sell software that helps customers organize and prepare their content to be searched, according to Arnold. "Most licensees of search systems don't know what they don't know," he wrote. "Once you have some experience with behind-the-firewall search, you have a better understanding of the importance of controlling and managing metadata."

There are also "building block," "linguistic processing" and "pattern analysis" vendors, Arnold wrote.

Though a plethora of companies are vying for market share, there may be plenty to go around. Analyst firm Gartner recently predicted search technology will locate and analyze more than 90 per cent of the data in more than half of the Global 2000 by the end of 2012.
