I've spent some time this week with the latest version of Dragon NaturallySpeaking, a dictation program that I've tried from time to time over the years. In the past, despite chronic trouble with RSI (repetitive strain injury), I could never convince myself to make dictation part of my routine working life. But with each generation of hardware and with each version of the program, the gap between desire and reality has narrowed. Now dictation technology may finally have crossed the threshold of practicality for me.
If you've never tried dictation, you can get a sense of how it works by watching a video I made shortly after I installed Version 8 of NaturallySpeaking. The out-of-the-box experience was dramatically better than before. It got even better when I fed the program all the articles and blog entries I've written during the past few years.
For me, typing remains the most efficient way to produce error-free copy. I expect it will take a few more turns of the evolutionary crank before dictation will be my first choice -- particularly because so much of my writing involves specialized markup (in text) or punctuation (in code). But you never know. As is traditional when tech reviewers write about dictation software, I am in fact dictating these words, and it's going remarkably well.
What I find most interesting about this process is the way in which I train the computer to be an intelligent assistant. Because recognition accuracy is such a difficult problem, dictation software has to pay very close attention to me. It has to learn everything it can about my speech patterns, vocabulary, and writing style. And it must leverage all this information to the maximum degree possible.
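To make that idea concrete, here is a toy sketch (not Dragon's actual algorithm — the function names and the weighting scheme are invented for illustration) of how a recognizer might use word frequencies learned from a user's own writing to choose among acoustically similar candidates:

```python
from collections import Counter

def train_vocabulary(corpus: str) -> Counter:
    """Count word frequencies in the user's past articles and blog entries."""
    return Counter(corpus.lower().split())

def rescore(candidates, vocab, acoustic_scores):
    """Pick the candidate the user is most likely to have meant, weighting
    each acoustic score by a smoothed prior from the learned vocabulary
    (a hypothetical scheme, much simpler than a real language model)."""
    total = sum(vocab.values()) or 1
    def score(word):
        prior = (vocab[word] + 1) / total  # add-one smoothing for unseen words
        return acoustic_scores[word] * prior
    return max(candidates, key=score)

# A user who writes about software says "SOA" far more often than "sewer",
# so the learned prior breaks the acoustic tie in favor of "soa".
vocab = train_vocabulary("soa services soa architecture protocols soa")
best = rescore(["soa", "sewer"], vocab, {"soa": 0.5, "sewer": 0.5})
```

The point of the sketch is just that feeding the program years of your own prose shifts those priors, which is why out-of-the-box accuracy improves so dramatically after training.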
Perhaps because we imagine that other application domains are not as challenging, other programs pay strikingly little attention to what we do. Sure, the browser will remember the last thing that you typed into a field on a form, and your e-mail program will help you keep track of whom you've replied to. But by and large, our so-called productivity software does not monitor what we do, is not meaningfully trainable, and does not grow more valuable over time as our relationship with it deepens. We are creatures of habit, but we are ill-served by software that does not notice or respond to those habits. When I organize my e-mail or conduct research on the Web, I exhibit predictable patterns of behavior. We have long expected but rarely experienced personal productivity software that absorbs those patterns, automates repetitive chores, and can be taught to improve its performance.
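What would it look like for an e-mail program to notice those habits? A minimal sketch, with invented names and an arbitrary threshold, of software that watches repeated actions and offers to automate them:

```python
from collections import Counter

class HabitWatcher:
    """Toy model of trainable productivity software: it records each
    manual filing action and, once a pattern repeats often enough,
    promotes it to an automatic rule."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold          # repetitions before we call it a habit
        self.actions = Counter()            # (sender, folder) -> times observed
        self.rules = {}                     # learned automations: sender -> folder

    def observe(self, sender: str, folder: str) -> None:
        """Record that the user filed a message from `sender` into `folder`."""
        self.actions[(sender, folder)] += 1
        if self.actions[(sender, folder)] >= self.threshold:
            self.rules[sender] = folder     # habit detected: automate it

    def suggest(self, sender: str):
        """Return the learned folder for a sender, or None if no habit yet."""
        return self.rules.get(sender)

w = HabitWatcher()
for _ in range(3):
    w.observe("editor@example.com", "Columns")
```

After three identical filings, `w.suggest("editor@example.com")` yields `"Columns"` — the software has absorbed the pattern and can take over the chore.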
If there is hope for the conventional, installed fat-client application, it lies here. As I mentioned last week, thin-client RIAs (rich Internet applications) can't easily collect or exploit interaction data. In principle, with open protocols and plenty of bandwidth, anything is possible. But intelligent assistance, in its most intimate form, will initially be delivered on the desktop and will be closely bound to it. As a result, we're likely to miss out on some interesting opportunities. When interaction data lives in the cloud, collaborative effects become possible. If you and I work closely together, for example, we might want our personal assistants to share our common vocabulary. A truly pervasive SOA (service-oriented architecture) would imagine and enable such scenarios.
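The shared-vocabulary scenario is easy to sketch. Assuming interaction data lived in the cloud behind some hypothetical service (the function and the data are invented here), pooling two users' vocabularies is just a merge:

```python
from collections import Counter

def merge_vocabularies(*vocabs: Counter) -> Counter:
    """Combine per-user vocabulary models into a shared team model —
    the kind of collaborative effect possible when interaction data
    lives in the cloud rather than on one desktop."""
    shared = Counter()
    for v in vocabs:
        shared.update(v)                 # sums counts for words in common
    return shared

# Two colleagues' (made-up) vocabularies, pooled into one team model.
yours = Counter({"SOA": 12, "RIA": 7})
mine = Counter({"SOA": 9, "NaturallySpeaking": 4})
team = merge_vocabularies(yours, mine)
```

Each assistant could then draw on the pooled counts, so terms one colleague coins would be recognized when the other dictates them.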
Meanwhile, I'm not complaining. Watching these words appear as I speak them is pretty darned cool!