As big data gathers momentum, it's helping to create big career opportunities for IT professionals -- if they have the right qualifications.
According to a 2011 report by McKinsey & Co., by 2018 the U.S. could face a shortage of 140,000 to 190,000 people with "deep analytical talent" and of 1.5 million people capable of analyzing data in ways that enable business decisions.
Companies are, and will continue to be, looking for employees with a complex set of skills to tap big data's promise of competitive advantage, market watchers say. "There's no question that the No. 1 requirement [for] enterprises that are serious about gaining a competitive advantage using data and analytics is going to be the talent to run that program," says Jack Phillips, CEO of the International Institute for Analytics (IIA), a research firm.
But what exactly constitutes "big data talent"? What are these jobs, and what skills do they require? What kind of background qualifies a person for a big data job? Computerworld took the pulse of some prominent players in the emerging field to determine an IT worker's place -- if any -- in the big data universe. Here's what they had to say.
Buckets of Skills
"There is no monolithic 'big data profession,' " says Sandeep Sacheti, former head of business risk and analytics at UBS Wealth Management, who now holds the newly created position of vice president of customer insights and operational excellence at Wolters Kluwer Corporate Legal Services.
Sacheti's new job is all about big data: using analytics to understand customers, develop new products and cut operational costs. In one project, the Wolters division that sells electronic billing services to law firms is using analytics to mine data it gathers from its customers (with their permission) to create new products, including the Real Rate Report, which benchmarks law firm rates around the country.
Sacheti is now both hiring from the outside and training internal staffers for big data work. He thinks of big data jobs in terms of four "buckets of skill sets": data scientist, data architect, data visualizer and data change agent.
But there are no standard titles -- other employers likely use different buckets and value different skills. What one company calls a data analyst, for example, might be called something different elsewhere, says John Reed, senior executive director at IT staffing firm Robert Half Technology. And, as Sacheti's title demonstrates, some big data jobs contain neither the word big nor the word data.
Some companies come to the IIA for help recruiting big data talent, Phillips says. First they ask where to look for candidates. "Then they stop in their tracks and say, 'Wait, how do I know what I'm looking for?' " he adds.
"Everybody's asking, 'How do you identify these people? What skills do you look for? What is their degree?' " says Greta Roberts, CEO of Talent Analytics, which makes software designed to help employers correlate employees' skills and innate characteristics to business performance.
Roberts, Phillips and other experts say the skills most often mentioned in connection with big data jobs include math, statistics, data analysis, business analytics and even natural language processing. And although titles aren't always consistent from employer to employer, some, such as data scientist and data architect, are becoming more common.
A Curious Mind Is Key
As companies search for big data talent, they tend to target application developers and software engineers more than IT operations professionals, says Josh Wills, senior director of data science at Cloudera, which sells and supports a commercial version of the open-source Hadoop framework for managing big data.
That's not to say IT operations specialists aren't needed in big data. After all, they build the infrastructure and support the big data systems.
"This is where the Hadoop guys come in," says D.J. Patil, data scientist in residence at Greylock Partners, a venture capital firm. "Without these guys, you can't do anything. They are building incredible infrastructure, but they are not necessarily doing the analysis."
IT staffers can quickly learn Hadoop through traditional classes or by teaching themselves, he notes. Burgeoning training programs at the major Hadoop vendors are proof that many IT folks are doing so.
That said, most of the jobs emerging in big data require knowledge of programming and the ability to develop applications, as well as an understanding of how to meet business needs.
The most important qualifications for these positions aren't academic degrees, certifications, job experience or titles. Rather, they seem to be soft skills: a curious mind, the ability to communicate with nontechnical people, a persistent -- even stubborn -- character and a strong creative bent.
Patil has a Ph.D. in applied mathematics. Sacheti has a Ph.D. in agricultural and resource economics. According to Patil, the qualities of curiosity and creativity matter more than one's field of study or level of academic credential.
"These are people who fit at the intersection of multiple domains," he says. "They have to take ideas from one field and apply them to another field, and they have to be comfortable with ambiguity."
Wills, for example, took a circuitous path to the role of data scientist. After graduating from Duke University with a bachelor's degree in math, he pursued a graduate degree in operations research at the University of Texas on and off while working for a series of companies before dropping out to take a job at Google in 2007. (He notes that he did eventually complete that master's degree.) Wills worked at Google as a statistician and then as a software engineer before moving to Cloudera and assuming his data science title.
In short, big data folks seem to be jacks of all trades and masters of none, and their greatest skill may be the ability to serve as the "glue" in an organization, says Wills. "You can take someone who maybe is not the world's greatest software engineer [nor] the world's greatest statistician, but they have the communications skills to talk to people on both sides" as well as to the marketing team and C-level executives, he explains.
"These are people who cut across IT, software development, app development and analytics," Wills adds, noting that he thinks such professionals are rising in prominence. "I'm seeing a shift in value that companies are assigning to these people," he says.
Sacheti, too, keeps his eye out for people like that. "We are finding there are a lot more who are flexible in learning new skills, willing to do iterative design and agile thinking," he says.
Roberts agrees. "The innate characteristics of people, like a predisposition to curiosity, can be more predictive of someone's performance in a role than them having a degree in, say, IT or IS or CS," she says.
Wanted: Relentless, Scientific Temperament
Until recently, creativity, curiosity and communications skills weren't typically emphasized in IT departments, which may be why many employers aren't looking to their IT operations staffs to find people to spearhead big data projects.
The IIA sees data science as resting on three legs: technological (IT, systems, hardware and software), quantitative (statistics, math, modeling and algorithms) and business (domain knowledge), according to Phillips. "The professionals we see who are successful come from the quantitative side," he says. "They know about the technology, but they aren't running the technology. They rely on IT to give them the tools."
Big data also demands a scientific temperament, says Wills. "When we talk about data science, it's really an experiment-driven process," he explains. "You're usually trying lots of different things, and you have to be OK with failure in a pretty big way." Wills goes on to say that there's a "certain kind of relentlessness you need in the personality of someone who does this kind of work."
Big data professionals also have to be intellectually flexible enough to quickly change their assumptions and approaches to problems, says Brian Hopkins, an analyst at Forrester Research. "You can't limit yourself to one schema but [need to be comfortable] operating in an environment with multiple schemas or even no schemas," he says.
That tends to be a different approach from the one most IT people are used to. "IT people coming out of a strong enterprise IT shop are going to perhaps be constrained a little bit in their ability to do things quickly and move fast and be agile," Hopkins says.
But once hiring managers find the right type of person, they're usually willing to retrain that person to fill a big data role. For example, Patil used to work at LinkedIn, where, he says, "we largely trained ourselves, because so much of this is open source." He thinks the same thing can happen at most companies. "You can make these people" -- if they have the right personality, he says.
IT workers who are flexible and willing to learn new tools, and who have a bit of an artist somewhere within, can move into data architecture or even data visualization, says Sacheti.
In short, big data carries big potential for IT pros who would relish an opportunity to show their creativity.
Frequent Computerworld contributor Tam Harbert is a Washington, D.C.-based writer specializing in technology, business and public policy.
This version of this story was originally published in Computerworld's print edition. It was adapted from an article that appeared earlier on Computerworld.com.
Big Data Job Titles and Skills
Without conventional titles, or even standard qualifications, it's hard to know what makes someone suitable for a big data job. This listing, based on interviews with big data experts and recruiters, attempts to match up some of the most common titles with the skills required.
• Data scientists: The top dogs in big data. This role is probably closest to what a 2011 McKinsey report calls "deep analytical talent." Some companies are creating high-level management positions for data scientists. Many of these people have backgrounds in math or traditional statistics. Some have experience or degrees in artificial intelligence, natural language processing or data management.
• Data architects: Programmers who are good at working with messy data, disparate types of data, undefined data and lots of ambiguity. They may be people with traditional programming or business intelligence backgrounds, and they're often familiar with statistics. They need the creativity and persistence to be able to harness data in new ways to create new insights.
• Data visualizers: Technologists who translate analytics into information a business can use. They harness the data and put it in context, in layman's language, exploring what the data means and how it will impact the company. They need to be able to understand and communicate with all parts of the business, including C-level executives.
• Data change agents: People who drive changes in internal operations and processes based on data analytics. They may come from a Six Sigma background, but they also need the communication skills to translate jargon into terms others can understand.
• Data engineers/operators: The designers, builders and managers of the big data infrastructure. They develop the architecture that helps analyze and process data in the way the business needs it. And they make sure those systems are performing smoothly.
"The people who do the best are those that have an intense curiosity," says D.J. Patil, data scientist in residence at Greylock Partners. Patil probably knows what he's talking about: Forbes magazine credits him and Cloudera founder Jeff Hammerbacher with coining the term data scientist. And earlier in his career, Patil helped develop the data science team and strategy at LinkedIn.
- Tam Harbert