Over the past 10 years I have become an avid user of voice recognition software. It’s common for people passing by my office to see me with a headset on, dictating email or composing documents. When they enter the room I reflexively say “go to sleep”, the magic invocation that makes Dragon NaturallySpeaking turn off its microphone.
My adoption of voice recognition software started with a bout of tendinitis from, cough, excessive coding and what some may term, cough again, workaholism. I didn’t take any breaks; I was working 14-hour days behind the keyboard and inhaling coffee to stay awake. This was back in Silicon Valley, and the days were heady as we pushed toward our next release.
A visit to an occupational specialist ensued, and after some rather ineffective, non-evidence-based treatment, I returned to my desk and continued to slog away at the keyboard in pain.
One day, after typing for several hours and feeling decidedly worn down, I found myself ruminating about how wonderful it would be if I could simply dictate the code I was writing. This was back in 2000, when voice recognition wasn’t a particularly effective technology except in the most limited scenarios. Fortunately, I was in an office with limited background noise, and I had the means and know-how to procure and install a high-quality headset.
I still remember how disappointed I was the first time I tried Dragon NaturallySpeaking. It took the better part of five minutes to get through the early microphone testing, 15 minutes to train it, and the recognition was just atrocious.
I persevered. Even after using the software for a short time it became apparent that I was being trained as much as I was training it. I had to learn to speak slowly, to not look at the screen at all, and to really think, in complete sentences, about what I was trying to say before I let the first words come out of my mouth. Most importantly, as all good writers will tell you, it’s essential not to edit the text as it lands on the page. That way lies the madness and incessant scrabbling of writer’s block.
Today Dragon NaturallySpeaking supports multiple voice models, including one that caters to my Australian accent. It works out of the box in almost all environments and routinely achieves accuracy of around 99%, provided you have taken the time to be trained by it as much as you have trained it.
For folks with severe impairments who use voice recognition to limp their way around the computer – performing mouse clicks, window resizing, and a variety of other functions by voice – I feel for you. While this functionality exists, I absolutely hate it; it makes my pace of work drag to an absolute crawl.
The money shot in voice recognition is that you can dictate long, complex sentences quickly and accurately, just as you would write them. In time it becomes more natural to dictate than to type; the physical act of having to hit keys is itself a distraction.
I do know of people who write computer code using voice recognition. Personally, I can’t. It’s just too complex to get such a syntactically rich set of thoughts onto “paper”, and I have been programming for so long that I don’t know that I could change these habits if I tried. Coding has become inherently tactile for me, and I lack the neuroplasticity to change that.
As voice recognition becomes more mainstream in the form of Siri and Cortana, the technology continues to improve. All of those voice samples are being sent to the cloud, together with any corrections you make in your messaging app. It’s hard to overstate the effect this will have on our futures. When you mix cloud computing with a massive training corpus, the results are sure to be impressive.
This article was dictated using Dragon NaturallySpeaking.