Recording telephone conversations between customers and agents is commonplace in call centers. Usually it's done for quality or regulatory purposes with technology that automatically records, logs and stores the audio files.
What happens to those recorded assets often is considerably less high-tech. A manager might randomly select a few phone calls per month for each agent and listen to replays to evaluate agent performance. Or training staff might single out calls that resulted in a strong sale to use for educational purposes. But often companies simply ignore the bulk of recorded assets because it's too expensive and time-consuming to manually review thousands of customer phone calls.
These days, call-mining specialists such as CallMiner and Nexidia Inc. and larger speech technology vendors such as ScanSoft and Witness Systems are aiming to change that with technology that does for audio assets what business intelligence software does for structured data.
Call-mining technology combines speech recognition, speech analysis and data-mining capabilities to make it easy for companies to find specific information in audio archives and spot service gaps, sales opportunities and emerging customer trends.
The software can run keyword-based searches to find instances when callers spoke certain product names or used phrases associated with dissatisfaction such as "speak to a manager." The software also correlates different attributes of calls to report trends - such as how often the mention of a competitor's product resulted in a service cancellation.
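To make the two capabilities concrete, here is a minimal sketch in Python. It assumes calls have already been transcribed to text and tagged with an outcome. The record layout, the competitor name "AcmeTel" and the function names are hypothetical and are not drawn from any vendor's product.

```python
# Hypothetical call records: a transcript plus a flag for whether the
# customer cancelled service. Real systems would pull these from a database.
calls = [
    {"id": 1, "text": "I want to speak to a manager about my bill", "cancelled": False},
    {"id": 2, "text": "AcmeTel offered me a better rate so cancel my service", "cancelled": True},
    {"id": 3, "text": "Can you confirm my order", "cancelled": False},
    {"id": 4, "text": "AcmeTel has a cheaper plan, what can you do", "cancelled": False},
]

def find_calls(calls, phrase):
    """Return the ids of calls whose transcript contains the phrase (case-insensitive)."""
    needle = phrase.lower()
    return [c["id"] for c in calls if needle in c["text"].lower()]

def cancellation_rate(calls, phrase):
    """Share of calls mentioning the phrase that ended in a cancellation."""
    needle = phrase.lower()
    hits = [c for c in calls if needle in c["text"].lower()]
    if not hits:
        return None  # the phrase never occurred, so there is no rate to report
    return sum(c["cancelled"] for c in hits) / len(hits)

manager_calls = find_calls(calls, "speak to a manager")
competitor_rate = cancellation_rate(calls, "AcmeTel")
```

In this toy data set, one call contains the dissatisfaction phrase, and half of the calls that mention the competitor end in a cancellation.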
One prison uses technology from start-up CallMiner to ferret out code words for contraband. The CallMiner software stores reference data about word-usage trends and can highlight when words that are not frequently used in normal conversation suddenly increase in prisoners' phone conversations, says Jeff Gallino, CEO of CallMiner.
In the past, by the time officials figured out "lollipop" was a code word for a certain drug, for example, the prisoners already would have started using a new code word. CallMiner's speech analytics can detect within a couple of hours when an atypical word suddenly is used more frequently, Gallino says.
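The idea of flagging a word whose usage suddenly jumps can be sketched roughly as follows. This is an illustration under simple assumptions, not a description of CallMiner's method: it compares a word's share of recent transcripts with its share of a reference set, and the thresholds `min_count` and `ratio` are made-up values.

```python
from collections import Counter

def spiking_words(baseline_calls, recent_calls, min_count=3, ratio=3.0):
    """Return words whose relative frequency in recent calls is at least
    `ratio` times their (smoothed) relative frequency in the baseline."""
    base = Counter(w for call in baseline_calls for w in call.lower().split())
    recent = Counter(w for call in recent_calls for w in call.lower().split())
    base_total = sum(base.values()) or 1
    recent_total = sum(recent.values()) or 1
    flagged = []
    for word, count in recent.items():
        if count < min_count:
            continue  # ignore words seen too rarely to be meaningful
        # Crude add-one smoothing so words absent from the baseline
        # do not cause a division by zero.
        base_rate = (base[word] + 1) / (base_total + 1)
        recent_rate = count / recent_total
        if recent_rate / base_rate >= ratio:
            flagged.append(word)
    return sorted(flagged)

# Hypothetical transcripts for illustration only.
baseline = [
    "how are the kids doing",
    "tell mom i said hello",
    "the lawyer will call on monday",
    "did you get the letter",
]
recent = [
    "bring the lollipop on friday",
    "he has the lollipop",
    "is the lollipop ready",
]
flagged = spiking_words(baseline, recent)
```

Common words such as "the" appear often in both sets and are not flagged. Only the word that is new to the recent calls stands out.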
Call-mining software also can search for phrases that agents didn't say but perhaps should have said for legal reasons. For example, financial services transactions can require agents to cite regulatory disclosures to customers. Companies can search for instances when those disclosures were not made, but should have been, says Anna Convery, senior vice president of marketing and product management at Nexidia.
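Searching for what was not said amounts to inverting a keyword search. The sketch below assumes the agent's side of each call is available as text. The disclosure wording, call ids and function name are hypothetical.

```python
def calls_missing_disclosure(agent_transcripts, required_phrases):
    """Map each call id to the required phrases the agent never spoke.
    Calls that contain every required phrase are left out of the result."""
    missing = {}
    for call_id, agent_text in agent_transcripts.items():
        text = agent_text.lower()
        absent = [p for p in required_phrases if p.lower() not in text]
        if absent:
            missing[call_id] = absent
    return missing

# Hypothetical disclosures and transcripts, for illustration only.
required = [
    "this call may be recorded",
    "past performance does not guarantee future results",
]
agent_transcripts = {
    "c1": "This call may be recorded. Past performance does not guarantee future results.",
    "c2": "Thanks for calling, your trade is confirmed.",
    "c3": "This call may be recorded. Your trade is confirmed.",
}
gaps = calls_missing_disclosure(agent_transcripts, required)
```

An exact substring match is the simplest possible test. A production system would likely need to tolerate paraphrasing and recognition errors.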
Continental Airlines is one early adopter that's rolling out call-mining software. The airline uses eQuality CallMiner -- a call-mining platform that combines technology from CallMiner and Witness Systems -- to perform automated call classification processes at its 900-agent reservation center in Tampa, Fla.
Continental classifies incoming calls into 50 different categories, depending on whether a customer called to book a flight, confirm flight information, change a seat assignment or redeem reward miles. With eQuality CallMiner, Continental automates the process of compiling its "call mix" survey.
In the past, Continental completed monthly surveys, but now it can review the data daily, said Andre Harris, director of reservations training and quality for Continental Airlines, in a statement. The software provides more context and intelligence than manual methods, and management can review a much larger sampling of interactions, Harris said.
There are two general approaches to call mining: speech-to-text and phonetic.
CallMiner's software is speech-to-text. It uses speech recognition to convert calls into searchable text, and uses speech-analysis techniques to generate call statistics about what was said and the conversation context. The data is stored in searchable databases for mining and gathering business intelligence.
Nexidia's software uses phonetics. The software breaks down audio into phonemes, the smallest distinctive units of speech sound. When a user runs a query for a word or phrase, the software identifies the relevant phonemes, then indexes and stores the results in a database for review.
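A toy version of phonetic search is sketched below. It assumes the audio has already been reduced to a sequence of phoneme symbols and uses a tiny, hypothetical pronunciation table. It is not Nexidia's algorithm, and a real engine would score approximate matches rather than require exact ones.

```python
# Hypothetical pronunciation table using ARPAbet-style symbols,
# included purely for illustration.
PRONUNCIATIONS = {
    "cancel": ["K", "AE", "N", "S", "AH", "L"],
    "manager": ["M", "AE", "N", "IH", "JH", "ER"],
}

def phrase_to_phonemes(phrase):
    """Convert a query phrase to a flat list of phonemes."""
    phonemes = []
    for word in phrase.lower().split():
        phonemes.extend(PRONUNCIATIONS[word])
    return phonemes

def find_phonetic_matches(track, query):
    """Return every start position where the query phonemes
    occur as a contiguous run in the phoneme track."""
    n = len(query)
    return [i for i in range(len(track) - n + 1) if track[i:i + n] == query]

# A made-up phoneme track standing in for one call's audio.
track = (["AY", "W", "AA", "N", "T", "T", "UW"]
         + PRONUNCIATIONS["cancel"]
         + ["M", "AY", "S", "ER", "V", "AH", "S"])

hits = find_phonetic_matches(track, phrase_to_phonemes("cancel"))
```

Because the search works on sounds rather than dictionary words, the same matching step applies to any term that can be spelled out as phonemes.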
Phonetic-based call mining is faster and handles searches for multilingual audio utterances more readily, says Art Schoeller, a senior analyst at The Yankee Group. Speech-to-text processing tends to consume more CPU resources, he adds.
Speech-to-text advocates acknowledge that initial call processing can take longer, but say subsequent searches are more efficient because all the audio content already was converted into database form.
Typical early adopters, such as financial services companies, airlines, telephone and cable companies, and government agencies, have driven increased interest in call mining over the last few years, analysts say.
"The problem of not having enough hours in the day for supervisors to listen to calls to provide more training and feedback crosses many of these verticals," says Art Schoeller, a senior analyst at The Yankee Group.
Government agencies, in particular, are turning to call mining and speech analytics to help pore over huge amounts of audio secured for homeland security initiatives, says Daniel Hong, voice business analyst at Datamonitor PLC. "If they use call mining, they're able to do that quite a lot faster, at a fraction of the cost, and on an on-going basis," he says.
Performance improvement is one reason for recent interest, according to Schoeller. In the past, processing audio files required a huge CPU commitment.
"At one time, it took a pretty hefty server one hour to process one hour of audio. You take a 100-agent call center, one-and-a-half shifts per day, you get 1,200 hours of audio to process," Schoeller says.
Today, some vendors say they can process 40 hours of audio for each hour of CPU time, Schoeller says. "We have also seen continued, gradual refinements in the algorithms themselves to improve accuracy and speed," he says.
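Schoeller's figures can be worked through in a few lines of Python. The eight-hour shift length is an assumption that makes his 1,200-hour total come out. The other numbers are the ones he cites.

```python
agents = 100
shifts_per_day = 1.5
hours_per_shift = 8  # assumption: not stated in the article

# Daily audio volume for the call center in Schoeller's example.
audio_hours_per_day = agents * shifts_per_day * hours_per_shift

# Older rate: one hour of audio per hour of CPU time.
old_cpu_hours = audio_hours_per_day / 1

# Rate some vendors now claim: 40 hours of audio per hour of CPU time.
new_cpu_hours = audio_hours_per_day / 40

print(audio_hours_per_day, old_cpu_hours, new_cpu_hours)
```

At the older rate, a day's audio would take 1,200 CPU hours to process, which is more than a single server can deliver in a day. At the claimed newer rate, the same workload needs about 30 CPU hours.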
Prices have become more affordable. Nexidia estimates a typical installation would cost between US$100,000 and $300,000, and says companies tend to grow their deployments as they get comfortable using speech analytics. CallMiner estimates a 200-seat call center would spend about $450,000 for its call conversion engine and analytic suite.
Looking ahead, applications for searching audio and video will emerge outside the call center, Schoeller says. For example, a company might have in-house media libraries with training videos or presentations that could be indexed.
But it won't happen overnight. While expectations are high, the reality of audio mining is that it's still an emerging science, and adoption remains in the early stages.