Rod Adkins, general manager for IBM Corp.'s pervasive computing division, used his keynote address at the VOX 2002 speech technology conference here both to promote speech technology in the enterprise and to criticize long-time rival Microsoft Corp. for fragmenting the speech industry by introducing another standard for speech development.
"A little salt is okay, but too much SALT is not good for you," Adkins said, referring to the Microsoft-supported development environment for speech-enabling Web sites and for so-called multi-modal interaction on non-PC devices.
Multi-modal refers to the ability to mix interfaces, such as voice and graphics, on a single device.
Last year, when the SALT (Speech Application Language Tags) Forum introduced technology supported by Microsoft, SpeechWorks, and other companies, SALT members claimed that Voice XML, the current speech development standard, was adequate for telephony-based applications but too cumbersome for small devices using mixed interfaces.
Although Adkins said the debate between Voice XML and SALT was not an us-versus-them argument, he appeared to couch it in those terms.
"Our colleagues in Redmond [Wash.] have one approach, and we recognize some developers will use SALT and Microsoft will eventually support its developers," Adkins said.
Adkins argued that a more reasonable position is that there has to be a single, standards-based approach if tools are to be developed, skills taught, and middleware built.
IBM, Motorola, and Opera Software sponsored a multi-modal specification, called X+V, and submitted it to the W3C last December. The W3C previously approved Voice XML as the standard for speech development.
"We think xHTML and Voice, X+V, is a sounder approach to capitalize on what is out there: 50,000 VXML developers," Adkins said.
IBM and Opera Software will also work jointly to develop a multi-modal browser based on the X+V specification, said Adkins. The beta version of the browser will be available this fall and will give users both Web-based and voice-based information on a single device.
Adkins ended his talk by reiterating that, in the current economic climate, the game plan should be to exploit existing resources. Companies, he said, will not want to pay for the new skills, new learning, and new investment that rolling out voice solutions would require if they selected SALT over Voice XML.