SQL to NoSQL: A match made in heaven for eHarmony.com
- 19 August, 2013 18:26
For online relationship match-making site eHarmony, pairing up two people to potentially spend the rest of their lives together is no simple task.
The company provides each member an in-depth personality test made up of dozens of questions aimed at determining exactly who someone is and what type of person they would be compatible with. That ends up being a lot of data the company has to manage. But this isn't just any type of data that fits neatly into rows and columns. Some people want to sort potential matches by only a handful of important characteristics, while others want compatibility across a whole gamut of categories. Members also have photos of potential matches to help them see if they've found Mr. or Mrs. Right.
Since its founding in 2000, eHarmony.com has relied on a powerful open source relational database, PostgreSQL. The centralized-server nature of the SQL system made it difficult to scale the database across a distributed platform, though. Something new was needed. After CTO Thad Nguyen explored other SQL options, like MySQL from Oracle, eHarmony turned to the NoSQL database MongoDB instead.
While the adoption of NoSQL databases is still in the early stages, some companies like eHarmony are beginning to realize the limitations of their SQL relational databases and are looking to next-generation NoSQL databases for answers. NoSQL databases are generally regarded as more easily scalable and lower cost (because they often run on commodity infrastructure), but the major difference between the two is the non-relational nature of NoSQL databases.
"You're not concerned about keeping data in sync between tables to maintain the relational integrity," says Nick Heudecker, a Gartner analyst who tracks the NoSQL market. This allows users of NoSQL systems to create different types of database architectures instead of being confined to the relational model, which stores data in tables of rows and columns.
Over at eHarmony, MongoDB allows Nguyen's engineers to create new database architectures on the fly, giving users more flexibility to create customized multi-attribute searches, and to match users on a bidirectional basis (meaning both users have attributes that match with one another, as opposed to a system like Netflix in which a user is choosing a movie, but the movie doesn't care which user watches it).
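The bidirectional matching idea can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the member documents, the field names (`wants`, `smoker`, `city`) and the equality-only preference check are hypothetical and are not eHarmony's schema or algorithm. Plain dictionaries stand in for schema-less documents, so each member can carry a different set of attributes and preferences.

```python
# Hypothetical member documents; field names are illustrative only.
# Each document may carry different attributes and a different number
# of preferences, which is what a flexible schema permits.
members = [
    {"name": "alex", "smoker": False, "city": "Los Angeles",
     "wants": {"smoker": False}},
    {"name": "blake", "smoker": False, "city": "Los Angeles", "pets": "dog",
     "wants": {"city": "Los Angeles", "smoker": False}},
    {"name": "casey", "smoker": True, "city": "Las Vegas",
     "wants": {}},  # no stated preferences
]


def satisfies(candidate, wants):
    """Return True when the candidate has every attribute value in wants."""
    return all(candidate.get(key) == value for key, value in wants.items())


def is_mutual_match(a, b):
    """A match counts only if each member meets the other's preferences."""
    return satisfies(b, a["wants"]) and satisfies(a, b["wants"])


def mutual_matches(people):
    """Compare every unordered pair once and keep the mutual matches."""
    pairs = []
    for i, a in enumerate(people):
        for b in people[i + 1:]:
            if is_mutual_match(a, b):
                pairs.append((a["name"], b["name"]))
    return pairs


print(mutual_matches(members))
```

In this toy data, casey has no preferences and would accept anyone, but is not paired with alex because alex's preference is not met. That asymmetry is the difference from a one-directional recommendation, where only the chooser's preferences matter.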
Using the NoSQL database on its distributed system has vastly increased the speed with which the company can process its data. Thanks to the sharded, distributed nature of the platform, the company's entire user set can now be searched for matches in 12 hours, where it previously could take weeks. "We have close to 1 billion potential matches we're poring through every day," Nguyen says, so he needed a database that could keep up and elastically scale to that demand while keeping costs in order. "It provides the scalability we need, with high throughput, built in support for high availability and the ability to support rich and complex queries."
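One way to picture how a distributed platform spreads that work is hash-based partitioning, sketched below in Python. This is only an assumption for illustration: the article does not say how eHarmony chose to distribute its data, the function names and user IDs are hypothetical, and this is not MongoDB's actual implementation. The point is that each server ends up holding a slice of the user set that can be scanned independently of the others.

```python
import hashlib


def shard_for(user_id, num_shards):
    """Map a user ID to a shard index using a stable hash.

    hashlib is used instead of the built-in hash() so the result is
    the same on every run and on every machine.
    """
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def partition(user_ids, num_shards):
    """Group user IDs into num_shards lists, one list per shard."""
    shards = [[] for _ in range(num_shards)]
    for user_id in user_ids:
        shards[shard_for(user_id, num_shards)].append(user_id)
    return shards


# Hypothetical user IDs spread over four shards.
user_ids = [f"user-{n}" for n in range(1000)]
shards = partition(user_ids, 4)
for index, shard in enumerate(shards):
    print(f"shard {index}: {len(shard)} users")
```

Because every user lands on exactly one shard, adding servers reduces the number of users each one has to scan, which is the general reason a partitioned system can shorten a full pass over the data.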
eHarmony's use case is not the norm, Gartner's Heudecker says. More common, he says, are enterprises implementing NoSQL databases for specific use-case projects, in testing and development environments, or for new database deployments. "You don't see a lot of rip and replace," he says. But NoSQL non-relational databases do have compelling advantages over relational SQL databases that are causing many to pay more attention to these options, either with the blessing of IT database administrators or in a "shadow IT" situation, Heudecker says.
Gartner breaks the NoSQL database market into four categories: document store databases, like MongoDB; key-value stores for content repositories; table-style databases for semi-structured data analysis, with Cassandra being a prime example; and graph databases, which specialize in storing data as nodes and the relationships between them.
Since implementing MongoDB about seven months ago, through vendor 10Gen, at eHarmony's two colocation centers in Los Angeles and Las Vegas, Nguyen says the process has been fairly smooth and straightforward. Because it's an open source platform, company developers have been able to rely not just on 10Gen for support, but on the broader open source community as well. The company still uses its legacy SQL databases for some back-end business process management tasks, but most of the core functionality of the company's website now runs on a NoSQL database.
Matt Asay, a vice president of business development at 10Gen, which works with companies to implement the open source database, admits that we are still early on in the NoSQL movement.
Heudecker estimates less than 5% of the addressable NoSQL database market has been penetrated thus far. But increased adoption is inevitable as businesses deal with more data that they want to use to their advantage to influence business decisions. "As we move from these systems of record to systems of engagement, where you interact more with users outside of the company, we'll see more apps being built that will benefit from leveraging NoSQL functionality," he says.
For eHarmony, using a NoSQL system ended up being a no-brainer decision. "It has tremendously improved our match quality and conversion rate," Nguyen says. Because eHarmony is an online system, its users want fast results when they make a query. Having databases that can scale, have flexible schemas and are cost-effective because of their open source nature was just the right match for eHarmony.