A major mystery behind Microsoft's 'brain-like' speech-to-speech translator

Speech-to-speech translation technology Microsoft is prepping for Skype has an unexplained capability: the more languages it learns, the better it becomes at the languages it learned first.

Microsoft CEO Satya Nadella introduced Skype Translate at the Code Conference in Rancho Palos Verdes, Calif., saying that it could translate between spoken languages, but he had no explanation for why its facility with languages learned early on gets a turbo boost from learning more.

"Say you teach it English, it learns English," Nadella says. "Then you teach it Mandarin, it learns Mandarin but it becomes better at English. And then you teach it Spanish, it'll get good at Spanish but it gets great at both Mandarin and English, and quite frankly none of us know exactly why. It's brain-like in the sense of its capability to learn. It's magical."

The phenomenon is known as transfer learning, Nadella says. The capability is part of a pre-beta Skype application that will be available to a limited test group later this year, according to Gurdeep Singh Pall, who heads up Skype for Microsoft.

In the demonstration, Pall spoke Indian-accented English via Skype with a German-speaking Microsoft employee, Diana Heinrichs.

The scripted conversation was stilted in that there was a little lag waiting for the translation to start, then the recipient had to listen to the translation and respond. For example, Pall said, "Hello, Diana, how are you doing?", which took two seconds to say, followed by a two-second pause until the translation started. There was an apparent network delay, too, as Heinrichs seemed still to be listening after the translation heard on Pall's end was complete.

The translations were spoken with mechanical voices, one male and one female, and simultaneous written translations appeared across the bottom of the screen. Microsoft says the application employs both Skype speech and instant messaging.

Not all the translation was perfect. One sentence spoken by Heinrichs was translated into English as, "I have many meetings with my colleagues in Redmond, and I take the opportunity to see her fiancée my."

Skype Translate draws on technology demonstrated two years ago at a conference in China, Microsoft says. That demonstration translated the keynote spoken in English into Mandarin. It differed from the Skype demo in that the Mandarin translation mimicked the voice of the speaker, Rick Rashid, who was worldwide head of Microsoft Research at the time, according to a Microsoft Research blog.

The technology introduced at the time called for translating the spoken word into text, then translating that into Mandarin and reproducing it as speech using an algorithm that approximated the sound of Rashid's voice.
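The three-stage approach described for the earlier demonstration (speech to text, text to translated text, translated text to speech) can be sketched roughly as follows. This is only an illustration of the chaining idea, not Microsoft's implementation: every function name here is hypothetical, and the toy stages simply pass text around as bytes and look words up in a tiny dictionary instead of processing real audio.

```python
from typing import Callable

# Hypothetical stage signatures; none of these names come from Microsoft.
Recognizer = Callable[[bytes], str]    # audio in, source-language text out
Translator = Callable[[str], str]      # source text in, target text out
Synthesizer = Callable[[str], bytes]   # target text in, audio out


def speech_to_speech(audio: bytes,
                     recognize: Recognizer,
                     translate: Translator,
                     synthesize: Synthesizer) -> bytes:
    """Chain the three stages in order: recognize, translate, synthesize."""
    text = recognize(audio)
    translated = translate(text)
    return synthesize(translated)


# Toy stand-ins so the sketch runs; real stages would be trained models.
TOY_LEXICON = {"hello": "你好"}  # assumption: a one-word English-to-Mandarin table


def toy_recognize(audio: bytes) -> str:
    # Pretend the "audio" is just UTF-8 text.
    return audio.decode("utf-8")


def toy_translate(text: str) -> str:
    # Word-by-word lookup; unknown words pass through unchanged.
    return " ".join(TOY_LEXICON.get(word, word) for word in text.lower().split())


def toy_synthesize(text: str) -> bytes:
    # Pretend the output "audio" is the UTF-8 bytes of the translated text.
    return text.encode("utf-8")


result = speech_to_speech(b"hello", toy_recognize, toy_translate, toy_synthesize)
```

Because each stage only sees the previous stage's output, a recognition mistake carries straight through to the translation and the spoken result.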

Microsoft didn't say what the process was beneath the covers for Skype Translate, but Nadella says it is a mix of speech recognition, machine translation and speech synthesis. "It's not just about daisy-chaining these three technologies and bringing it together. In fact it's this deep neural net that you build that synthesizes a model to be able to do speech recognition," he says.

The underpinnings of Skype Translate also power Microsoft's digital assistant, Cortana, which responds to oral queries with spoken answers.

Microsoft says it will have more details closer to the beta test about how many languages Skype Translate will support and how it becomes aware of what languages it is dealing with in each conversation.

Tim Greene covers Microsoft and unified communications for Network World and writes the Mostly Microsoft blog. Reach him at tgreene@nww.com and follow him on Twitter @Tim_Greene.
