IBM Corp. is making a strong push to fully support Linux. One interesting offering is its ViaVoice Dictation for Linux. The product has been available on Windows for a while, and IBM recently decided to bring its speech recognition star to the Linux platform.
I've always been interested in speech recognition technology. I'd love to write my stories while hiking or check e-mail while watching TV. The technology has come a long way in the past few years, with ever more accurate recognition and improved voice training. But it always seems that we're still a few steps away from truly useful speech recognition.
Most people are disappointed when they try speech recognition for the first time. They expect sci-fi-quality transcription that requires no training or customization. But human speech is highly variable and inconsistent -- no two people speak the same words in exactly the same manner, and we often use context to infer the words and meanings in speech we hear.
These tools are a solution in search of a problem. Sure, eventually Star Trek-like speech recognition will be available, but as it stands right now, the technology is more suited to narrow tasks such as dictation and simple computer control. So don't expect to throw away your keyboard and mouse and do all of your Web surfing and e-mail with your voice.
I can still remember the first computer I used with speech recognition software. It was a Mac Quadra that included software to recognize common commands -- a neat toy in the small office I worked in, but not very useful for real-world tasks.
I was impressed that it could understand multiple people without our having to train it to recognize our voices. But this caused enough problems and confusion that the feature was disabled in short order. Any person within earshot could say, "Computer, shut down," or ask, "Computer, what time is it?" and it would respond.
I've used many speech recognition products since that old Mac was laid to rest, and great strides have been made. If you're looking for an easy way to dictate reports (or essays like this one) and don't mind spending time training the software to accurately transcribe your voice, the current round of voice recognition products is right up your alley.
I recently received a copy of IBM's ViaVoice Dictation for Linux. I've used ViaVoice on Windows in the past and found it to be a solid product with decent recognition capabilities. With IBM's push to support its entire line of products on the Linux platform, I was intrigued to see how this latest release measured up for voice dictation.
Overall, I was pleased with IBM's ViaVoice technology. As with many Linux products, the installation procedure still needs to be simplified and automated to make it easier for businesses to deploy. But the interface was easy to understand, and it recognized most things I said after I spent only 20 minutes training it. It is intended mostly for straight speech dictation, but some commands to control the built-in text editor are included.
More important than this one speech product is IBM's commitment to Linux for all of its software and hardware platforms. ViaVoice Dictation is Java-based, making it perfect for cross-platform environments. With access to the technology that powers ViaVoice Dictation, developers can integrate speech recognition into the software they develop. This will help fuel more robust and accurate speech recognition tools. For more information on ViaVoice technology, see www.ibm.com/software/speech.
Speech recognition is improving, but it's not quite ready for general use. If you want to dictate simple text documents, products such as ViaVoice Dictation offer reasonable recognition and easy correction. But if you want to talk to your computer and make it follow your every whim, you'll have to wait a few more years.
With IBM's strong support of the Linux platform, including releases such as DB2, WebSphere, and of course ViaVoice Dictation for Linux, the open-source OS is finally getting the attention it deserves. Business leaders can take Linux seriously with such heavyweight support behind it.
What do you think of IBM, Linux, and speech recognition technology? Drop me an e-mail at firstname.lastname@example.org and let me know.
(Kevin Railsback is West Coast technical director of the InfoWorld Test Center.)