The R programming language continues to permeate the big data environment. Data analysis platform provider 1010data has added R to its arsenal, allowing its many enterprise customers to interrogate their data with the statistics-oriented programming language.
"We know that a lot of data scientists and modelers have grown up using R, and that is the environment they are comfortable with," said Jed Alpert, 1010data vice president of marketing. "They build data models in R and then have the power of our platform to run the analytics on all of the data."
Founded in 2000, 1010data provides organizations with a set of services for analyzing large data sets, eliminating the need to set up systems to do the work in-house.
With the new option of using the R language, organizations won't need to train their data scientists in 1010data's own query language. Like other 1010data services, R can be accessed through a browser.
The new service will also benefit long-time users of R who wish to use the language to investigate larger data sets, something that has been fairly difficult until recently. The stock implementation of R is a single-threaded application, which means it can't be used effectively on data sets distributed across multiple servers. For the service, 1010data developed its own software to run R against large, distributed data sets.
With millions of users worldwide, R is one of the most widely used programming languages specifically designed for statistical computing and predictive analytics, alongside SAS, MATLAB, Mathematica and a number of Python libraries. Its popularity has grown as more organizations take on big data analysis to learn more about their customers and improve operations.
"R is really good at allowing users to modify different statistical analysis methods to meet their needs," said Chris Simon, 1010data senior analyst.
A number of other companies have also recently extended R for big data use. Hewlett-Packard created Distributed R, an open-source package to run the language across computer clusters. Microsoft, which recently purchased R distributor Revolution Analytics, offers the R language as an interface for its machine learning cloud service.
More than 700 organizations use 1010data, including many large companies in the fields of retail, manufacturing, telecommunications, and financial services. The New York-based 1010data maintains more than 19 trillion rows of data on behalf of these clients.