Dragon NaturallySpeaking Revisited
Dateline: August 24, 1997
THIS is my second review of the first desktop software for automatic speech recognition (ASR) capable of handling continuous speech. This time, I used a better sound card, and the review reflects that fact. But don't get too excited -- ASR is not completely out of the woods yet.
My main beef in the first review, you may recall, was that even for a two-finger typist like yours truly, it was quicker to type than to use NaturallySpeaking. The software made too many mistakes, and correcting them took too much time. Remember, the software can and will misunderstand you (Dragon reports "up to 95% accuracy," which I read as "more than 5% inaccuracy"), but it can't and doesn't misspell. If you say (as I did, in one session) "the grass on which the deer fed" but NaturallySpeaking thinks you said "across on which Padilla fed" (as it did), then your trusty spell checker will be totally useless (as was mine).
(Psst: Anybody know who Padilla is?!)
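For the technically curious, here is a rough Python sketch of why the spell checker is no help with the deer-and-Padilla example above. It counts word-level errors with a standard edit distance (substitutions, deletions, insertions) and then runs a toy "spell check" on the result. The names `word_errors`, `misspelled`, and `TOY_DICTIONARY` are my own inventions for illustration, and the tiny dictionary is an assumption standing in for a real spell checker's word list; none of this is how Dragon's software works internally.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Word-level edit distance between what was said and what was transcribed."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] = edits needed to turn the reference words seen so far
    # into the first j hypothesis words
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word dropped
                            curr[j - 1] + 1,      # extra word inserted
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def misspelled(text: str, dictionary: set) -> list:
    """Return the words a dictionary-lookup spell checker would flag."""
    return [w for w in text.lower().split() if w not in dictionary]


said = "the grass on which the deer fed"
heard = "across on which Padilla fed"

# Assumption: a toy word list; a real spell checker's dictionary is far larger.
TOY_DICTIONARY = {"the", "grass", "on", "which", "deer", "fed", "across", "padilla"}

errors = word_errors(said, heard)
print(f"{errors} word errors in {len(said.split())} words spoken")
print("flagged by spell check:", misspelled(heard, TOY_DICTIONARY))
```

Under these assumptions the sketch counts 4 word errors in 7 spoken words, yet the spell check flags nothing, because every wrong word is a correctly spelled one.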
The software still makes mistakes, but not as many as the first time around. The difference is the sound card. It took me a while to get a response from Dragon Systems (Dragon please note: Your Help Desk needs help -- check out The Molloy Group's neural-net-based Top of Mind help desk software) to the simple question: "Among the half-dozen sound cards certified to work OK with your software, do any stand out as being better than the rest?" I asked because I wanted to give the software its best shot in the review. I already met the remaining parameters -- a minimum 133 MHz Pentium, at least 32 MB of RAM, and oodles of disk space.
The answer, when it came, was expensive: Creative Labs' SoundBlaster AWE64 Gold, weighing in at a discounted $199 at my local Best Buy store. But I bought it anyway, and spent an hour rummaging around the innards of my Compaq Presario installing it.
(This is somewhat tangential to the review, but not completely irrelevant: I spent that hour trying, without success, to uncouple the CD-ROM sound cable from its seat on the Compaq motherboard, which has an Ensoniq sound card chipset built in. The plug was barely reachable with my fingertips, and try as I might I could not unplug it. I think it must be glued or soldered on. I was tempted to have at it with some long-nosed pliers, but thought I'd better do the review first, before totally trashing the Compaq. I also discovered that the plug on my speaker wires would not fit the socket on the sound card. Result: I can dictate into the machine through the SoundBlaster and the microphone Dragon supplies, but I can't get an audible peep out of the machine. Messing around in Windows setup to try to get it to recognize both sound cards and use one for recording and one for playback resulted in all sorts of Gatesian error messages, so I gave up. I thought of calling someone, but whose responsibility was it? Dragon, Creative Labs, Ensoniq, Microsoft, and Compaq would probably point fingers at each other, and I'd be left stuck up the gum tree. For now, playback is not a problem, but I know I will want it some day and will have to get this sorted out.)
So anyway, I re-ran the software installation, as Dragon recommends after changing a sound card, and this time it reported that sound quality was satisfactory. Not blazingly good, mind you; merely satisfactory. But this was better than my old sound card, which gave "poor" quality results according to the installation's sound card test routine.
I also re-ran the training session, where you read a 30-minute passage on screen into the microphone so NaturallySpeaking can learn how you pronounce the words and phrases in the passage. Last time, I chose a Dave Barry passage, which made me giggle as I was reading it, which tended to mess up the training. So this time I selected a passage from Arthur Clarke's sequel to 2001 called, surprise, 3001.
Whenever you dictate (not just during training), NaturallySpeaking adjusts its "speech file," a large file that contains information on your speech patterns. In this second installation of the software, the old speech file was kept, so I assume its contents were still useful even though my hardware had changed.
Finally, we got down to business, NaturallySpeaking and I. I dictated two long sets of hand-written notes I had made while reading Borges' Labyrinths and Dennett's Darwin's Dangerous Idea (on which more next week), amounting to some four hours in total. The result was decidedly better than it was in the pre-SoundBlaster epoch. It unquestionably made fewer mistakes.
But it still makes some mistakes. For your two-fingered host, the time required to type a long set of notes is now about the same as it takes to dictate and then correct them. The difference is that I don't need to spend nearly so much time flexing my wrists at the keyboard, a pretty important consideration because I have developed carpal tunnel syndrome (which is like a hard-disk crash -- you think it can never happen to you, but believe me it's painful when it does).
I also hope and expect (Dragon's marketing literature says I should expect) that NaturallySpeaking will get better the more I use it. My sense is that the program as it is working for my voice today is already about 99 percent as good as it gets given my somewhat mumbly, British-accented speech, and that another percentage point or two of improvement will not make a huge difference. One reason I think this is that, despite repeated training sessions, it continues to make mistakes on some very common words. It frequently puts in "the" where I say "a," and "and" where I say "an." These errors are really irksome after a while, and I find myself snarling into the microphone when making corrections, which hardly helps matters. (It's like snarling back at one's wife: communications only get worse, though in NaturallySpeaking's case, it will still love you in the morning.)
Dragon has just announced a version for Brits, and that presumably would make a difference. If Dragon is prepared to send me another review copy, then I'm game to give it another go. As things are, though, the bottom line is that Dragon NaturallySpeaking's American accent version is just, but only just, good enough to encourage me to continue to use it. And since it is almost the only continuous-speech ASR game in town, it is a good choice for anyone who can identify with me -- two fingers, carpal tunnel, and all.
I said "almost" the only game in town because, as of last week, it has some competition. IBM's ViaVoice has just hit the streets, and it claims to do what NaturallySpeaking does plus a bit more, and for a lot less money. The bit more is that it can repeat what you have dictated, out loud, by means of a built-in text-to-speech synthesizer. This might be a boon -- it might be easier to catch mistakes by listening than by reading (I know I miss quite a few errors when I first read over the text I have dictated) -- but I'd have to experience it to know how much of a boon. At $99 (intro price) ViaVoice is way cheaper than NaturallySpeaking's $699 list or even its roughly $300 street price.
ViaVoice still needs a high-end sound card, and it needs a machine even more powerful than required for NaturallySpeaking -- a minimum 166 MHz Pentium as opposed to 133 MHz for the Dragon software (and a Wall Street Journal reviewer reported the Dragon product worked OK on his 90 MHz machine).
One company attempting to spike both Dragon's and IBM's guns is Fonix, which claims to be close to perfecting and licensing an ASR program that will blow their socks off. Fonix claims: its product will be speaker independent, meaning it will require zero training to your voice; it will be more accurate, because it will have a much richer store of contextual knowledge; and it will impose a lighter hardware burden because it uses a more efficient algorithm than the statistic-sampling Markov Model used by its competitors. Note the future tense. The Fonix product is not out yet, and it may be some time before it is.
I'm writing both IBM and Fonix to request review copies of their software so I can review them here. We'll see what happens, and you can be sure I'll keep you informed.
Until next week,

NEXT WEEK: Darwin's Dangerous Idea. Notes from my recent reading of philosopher Daniel Dennett's mighty tome on evolution and its relevance to AI.