Computerworld

Could Google's 'dataspaces' reshape search?

Do not expect Google to suddenly bring a game-changing product to market

Google -- the company most identified with Web search -- is not the leading player behind the firewall. It claims about 9,000 customers for its enterprise search products, while independent search vendor Autonomy says it has 17,000.

Still, in his recent report "Beyond Search," for Gilbane Group, analyst Stephen Arnold portrays the company as a quietly humming engine of activity, with work under way that could "leapfrog" the current generation of search technology.

Arnold, who closely tracks Google's patent applications, is especially interested in a concept called "dataspaces," which stems from the work of Google researcher Alon Halevy. Dataspaces, in Arnold's view, take "content processing into a new dimension."

"A dataspace should contain all of the information relevant to a particular organization regardless of its format and location, and model a rich collection of relationships between data repositories," Halevy wrote along with two co-authors in a December 2005 paper. "Hence, we model a dataspace as a set of participants and relationships."

"The participants in a dataspace are the individual data sources: they can be relational databases, XML repositories, text databases, Web services and software packages," the paper states at another point. "A dataspace should be able to model any kind of relationship between two (or more) participants."
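To make the paper's description concrete, here is a minimal sketch in Python of a dataspace as a set of participants plus relationships that can span two or more of them. It is only an illustration of the quoted model, not Google's or Halevy's implementation. The class names, method names, and the sample sources ("crm", "wiki", "hr_feed") are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Participant:
    """One data source in the dataspace (hypothetical structure)."""
    name: str
    kind: str  # e.g. "relational database", "XML repository", "Web service"


@dataclass(frozen=True)
class Relationship:
    """A labeled link between two or more participants."""
    label: str
    members: frozenset  # names of the participants involved


@dataclass
class Dataspace:
    """A set of participants and the relationships between them."""
    participants: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)

    def add_participant(self, name, kind):
        self.participants[name] = Participant(name, kind)

    def relate(self, label, *names):
        # The paper allows relationships between two (or more) participants.
        if len(set(names)) < 2:
            raise ValueError("a relationship needs at least two participants")
        unknown = [n for n in names if n not in self.participants]
        if unknown:
            raise KeyError(f"unknown participants: {unknown}")
        self.relationships.append(Relationship(label, frozenset(names)))

    def related_to(self, name):
        """Return the names of every participant linked to `name`."""
        linked = set()
        for rel in self.relationships:
            if name in rel.members:
                linked |= rel.members - {name}
        return linked


# Illustrative data only: three made-up sources in one organization.
ds = Dataspace()
ds.add_participant("crm", "relational database")
ds.add_participant("wiki", "text database")
ds.add_participant("hr_feed", "Web service")
ds.relate("describes the same customers", "crm", "wiki")
ds.relate("shares employee records", "crm", "wiki", "hr_feed")

print(sorted(ds.related_to("hr_feed")))
```

The sketch keeps participants format-agnostic (each is just a name and a kind), which mirrors the paper's point that a dataspace covers sources "regardless of ... format and location."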

While other vendors are pursuing similar goals, they cannot compete on scale with Google, according to Arnold.

"Even the most robust content processing systems have not been engineered to handle Google-level content flows. The implication of scale means Google is operating largely without competition from the companies profiled in this study," he wrote in "Beyond Search."

Meanwhile, Google indeed appears to have ambitious search and content-processing projects in the patent pipeline that echo the dataspaces concept.

One in particular, US Patent Application No. 20070198481, "Automatic Object Reference Identification and Linking in a Browseable Fact Repository," describes an invention that pulls together a wide range of data on an individual or topic into a kind of dossier.

Google declined to comment on patent applications or make Halevy available for an interview.

"We file patent applications on a variety of ideas that our employees come up with," a company spokesman said via e-mail. "Some of those ideas later mature into real products or services, some don't."

But a company executive was willing to describe the company's approach to search in general terms.

"Inside an enterprise, and maybe unlike the Internet, you can know a lot about a user," such as who they report to, said Matthew Glotzbach, director of product management for Google's enterprise division. "There's a lot of empirical information you can derive. All of that can be used to create a very, very rich profile about the user, which can then be used to create a really rich search experience."

Do not expect Google to suddenly bring a game-changing product to market, according to Glotzbach.

"The model is not these kind of big-bang approaches where we work for multiple years and then roll something out. In terms of what we do in enterprise search, you'll see a constant flow, as opposed to one sort of big bang -- here's a whole new thing," he said.

Copyright 2009 IDG Communications. ABN 14 001 592 650. All rights reserved.
Reproduction in whole or in part in any form or medium without express written permission of IDG Communications is prohibited.