Startup OnlyBoth turns IBM's Watson on its head

OnlyBoth, a Pittsburgh startup whose first offering is an app that generates often surprising insights about U.S. colleges and universities, has been a long time coming.

Co-founder Raul Valdes-Perez says the cloud-based software at the heart of his new business stems from work begun in 1998 and funded via a National Science Foundation grant, but that the "niche finding" technology was shelved when a more immediate business opportunity presented itself in the form of Vivisimo. Valdes-Perez co-founded that clustering-powered search company at Carnegie Mellon University and sold it to IBM in 2012 (he was chairman of Vivisimo at the time, so never wound up working for IBM).


CEO Valdes-Perez describes the self-funded OnlyBoth as taking something of a reverse approach to IBM's Watson artificial intelligence technology, which can make sense of unstructured information, as was on display during its famous Jeopardy! game show performance in 2011. OnlyBoth algorithms sort through structured big data, initially on 3,122 U.S. colleges and universities described by 190 attributes, and spit out fun facts and comparisons in the form of perfect English sentences. The company's motto: "A sentence is worth 1,000 data."

"We're about structured data in, unstructured data out," says Valdes-Perez, a computer scientist and adjunct professor at CMU. "It was a hard problem to solve."

For example, when I popped "Massachusetts Institute of Technology" (or rather, "MIT") into the search box on OnlyBoth, the Web-based service turned out this gem: "MIT spends the most on research ($1,128M) among all 3,122 colleges," plus it shared some comparative info for Johns Hopkins, Stanford and others. You can get up to 25 insights per school, and can sort by topics, such as dorm capacity and Rhodes Scholar alumni. You can also discover "surprising" facts and compare schools to their neighbors and rivals. The app is linked to Facebook, Twitter and other social apps in case you want to share findings with your connections, and Valdes-Perez says he expects this is one of the main ways that people will learn about OnlyBoth.
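To make the "structured data in, sentences out" idea concrete, here is a minimal sketch of how a superlative like the MIT example could be produced from a table of attributes. It is not OnlyBoth's actual algorithm, which the company has not described here. The function name `superlative_sentence`, the field names and the three placeholder colleges with their figures are all hypothetical.

```python
# Hypothetical toy rows; names and figures are placeholders, not real data.
COLLEGES = [
    {"name": "College A", "research_spend_musd": 120},
    {"name": "College B", "research_spend_musd": 1450},
    {"name": "College C", "research_spend_musd": 75},
]


def superlative_sentence(rows, attribute, topic):
    """Return a sentence naming the row with the largest value of `attribute`."""
    best = max(rows, key=lambda row: row[attribute])
    # Format the value with thousands separators, in millions of dollars.
    return (
        f"{best['name']} spends the most on {topic} "
        f"(${best[attribute]:,}M) among all {len(rows)} colleges."
    )


print(superlative_sentence(COLLEGES, "research_spend_musd", "research"))
```

A real system would presumably also have to decide which of many candidate facts are interesting enough to show, which this sketch does not attempt.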

Valdes-Perez warns that top schools like Harvard University return among the most boring results since they're so strong across the board. About one in six results on most schools, however, could be classified as being "on the negative side," Valdes-Perez says.

"One value of this technology is that it really shines a light on data, it makes what's hidden transparent," Valdes-Perez says. "Lots of people talk about college transparency... one way to do it is create a dataset and generate this sort of content."

The idea that Valdes-Perez would start a company focused on bringing simple facts to light isn't surprising when you get to know him a bit and learn, for example, that he forgoes a "distracting" smartphone in his personal life for a plain old dumb phone.  

Valdes-Perez claims his team isn't worried about how to make money off the initial app, but will use the experience to learn how people interact with it and what future applications might be. "We'll worry about making money in the future," he says.

Initially, users will find the web service at the OnlyBoth website, but Valdes-Perez says the company has fairly liberal terms of use that would let people put the technology on their websites. He says the dataset used to feed OnlyBoth is based largely on federal government reports, which are released every couple of years, though it also includes data collected by the team, such as alumni who go on to play in the National Football League.

Future apps could focus on topics such as sports, politics, genetics, investing and even journalism, says Valdes-Perez, who started OnlyBoth with CTO Andre Lessa, whom he recruited to work at Vivisimo and then teamed up with on a couple of projects there.

The initial OnlyBoth app might be described as a consumer offering, but Valdes-Perez says there's no reason that enterprises, government agencies and, naturally, universities couldn't all put the technology to work for them. It's possible that data collected through sensors might be automatically funneled into structured datasets that OnlyBoth algorithms could go to work on, but Valdes-Perez says the technology would most typically make sense for use with datasets of interest to people and compiled through human effort.

Oh, and one finding you probably won't get via OnlyBoth's initial app: Where the company gets its name. The software occasionally generates statements in the form "Only A is both P and Q" or "Only A has both as many X and as many Y".
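The "Only A is both P and Q" pattern can be illustrated with a short sketch: look at every pair of properties and report the pairs that exactly one entity holds jointly. This is only an illustration of the sentence form, not the company's method. The function `only_both_statements`, the placeholder colleges and their properties are hypothetical.

```python
from itertools import combinations

# Hypothetical toy dataset: each entity maps to the set of properties it has.
ENTITIES = {
    "College A": {"urban", "public", "home to a medical school"},
    "College B": {"urban", "private"},
    "College C": {"public", "home to a medical school"},
    "College D": {"urban", "private", "home to a medical school"},
}


def only_both_statements(entities):
    """Return one sentence per property pair that exactly one entity holds."""
    properties = sorted(set().union(*entities.values()))
    sentences = []
    for p, q in combinations(properties, 2):
        # Collect every entity that has both properties of the pair.
        holders = [
            name for name, props in entities.items() if p in props and q in props
        ]
        if len(holders) == 1:
            sentences.append(f"Only {holders[0]} is both {p} and {q}.")
    return sentences


for sentence in only_both_statements(ENTITIES):
    print(sentence)
```

Checking every pair is quadratic in the number of properties, which is fine for a toy table but would need pruning across many attributes and thousands of entities.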

Bob Brown tracks network research in his Alpha Doggs blog
