Review of L&H Voice Xpress

Dateline: May 17, 1998

I'm back! Did you miss me, or shouldn't I ask?

Fact is, I suffered a two-week bout of what afflicts all writers (I believe) from time to time: burnout. Got fed up trying to find something new and useful to say or report, so instead went out and chopped a couple of half-dead trees, built a brick mailbox, and suchlike soul-salving stuff. Turned out to be good for my frozen shoulders, too, but that's another story.

Halfway through last week, the UPS truck (successfully avoiding my brick mailbox, which that august and imperious body, the Clinton County Road Commission, has since ordered me to remove because there's an ordinance prohibiting brick mailboxes, for Heaven's sake) trundled up my drive and deposited a package from Lernout & Hauspie (L&H). It was a review copy of L&H's just-released automatic speech recognition (ASR) program called Voice Xpress Plus, US$99 and available all over the place.

L&H, you may recall, is the Belgian ASR company of which Microsoft bought a slice (8 percent, if memory serves) for about US$45 million, or the small change in Bill's pocket, a year or so ago. L&H had just purchased Kurzweil, one of the originators and true innovators in the ASR field, and continues to sell specialized dictation products under the Kurzweil name.

I've previously expressed the fear that Microsoft will see a voice-driven computer interface as a threat to Windows' dominance of my desktop and yours, just as it saw a similar threat from Netscape's browser. Microsoft Research's alleged foray into ASR evidently led to nothing innovative, just a decision that they'd be better off buying than building the stuff themselves. By buying into L&H, not to mention by just being Microsoft, Microsoft will essentially control that company, and in my view will at some point seek to incorporate L&H's ASR algorithms directly into Microsoft's operating systems. This is the one fundamental reason why Microsoft so adamantly and vehemently opposes the U.S. federal and state justice system, which would stop Microsoft from integrating products into its operating system.

Integrating ASR with Windows is what I would also want to do if I were adamant about controlling every desktop in the world and did not, at heart, give a fig for innovation. But that would make it harder for Dragon Systems and IBM, the two major competitors in the ASR arena, to compete. And with Dragon and IBM out of the ASR picture, Microsoft would no longer have any motivation to spend heavily on improving the ASR. Indeed, it would have more incentive to curtail innovation, thereby cutting costs, thereby increasing profits.

But all this is speculation, and for the present it does not detract from the excellent work L&H and their Kurzweil colleagues have done with Voice Xpress and other products. From my limited experience with it so far, Voice Xpress promises to be at least as good as Dragon's NaturallySpeaking and IBM's ViaVoice. Unless and until those two programs encounter unfair competition, the end user can expect to benefit as all three go head to head to continually make their products better than the others.

Having said this much, I am going to have to leave you dangling for another week for a review of the program's performance under fire. It took me a couple of days to install the program properly, so I just didn't have time to give it the workout it needs for a thorough and valid review. What I will do in the remainder of this "Review Part 1" is list the program's features as claimed by L&H and describe my experiences during installation.

Features
Below is what L&H claims for its new product. My comments are in green text.

Installation Blues

I said earlier it took me a couple of days to install. Don't be alarmed. I had a system setup situation which most of you probably won't have, and this is what caused me grief. In the first place, I run Windows NT Workstation 4.0, not the crash-prone Windows 95 most of you will be afflicted with. (If you're thinking of upgrading to Windows 98 if and when it ever gets released, take my advice: don't! Switch to Windows NT Workstation instead.) Running Voice Xpress on NT vs. 95 should not make a difference, since Voice Xpress is designed for both operating systems, but who knows . . . .

In the second place—and this is where L&H's support staff suspect my problems lay—I had installed both Microsoft Office 95 and Office 97 on my machine. I can't remember if, when upgrading to Office 97, I uninstalled Office 95. I don't think I did—I think I assumed the upgrade would take care of eliminating the old stuff; but anyway, on going through the contents of my hard disk and the Registry there were still bits of the old version of Office lying around.

But none of this was apparent or suspected to start with. I inserted the Voice Xpress CD-ROM and installed the program with ease. A warning message recommended downloading and installing a set of (free) bug fixes known as Service Release 1 for Microsoft Word 97 from Microsoft's Web site, saying that it would improve Voice Xpress's performance. Since it was a recommendation and not an imperative, I ignored it, and ran the program as soon as the installation was finished and I had rebooted the machine.

(Later on, in trying to solve the problem I'm about to describe, I went ahead and downloaded Service Release 1 for Office 97 (of which Word is a component), about 2 megs or 20 minutes over the miserably slow and archaic GTE phone line and switch in my area, only to find that in order to install it, I also had to install a 20 meg bug patch for NT 4.0 called Service Pack 3. Sigh. I left the modem to download overnight and went to bed. The next day I had some serious spring cleaning to do on my C drive partition (which IBM in its peculiar notion of wisdom had set at a measly 500 megs out of 4 gigabytes available on the disk) in order just to be able to install these patches, but my machine is now bang up to date as far as NT and Office are concerned. But back to our story . . . )

The Voice Xpress Plus program group contains two ways of invoking the program: either with Microsoft Word or with the built-in XpressPad mini word processor. I went straight for the MS Word option, and while MS Word loaded OK, there was no sign of Voice Xpress. So I shut down Word and tried the XpressPad option. This worked fine: the mini word processor loaded, then Voice Xpress loaded on top of it, leaving its own menu bar at the top of the screen with controls for turning the microphone on and off, etc.

I went through a short and easy routine for calibrating my sound system and the (very light and easy to wear) headset microphone bundled in with Voice Xpress, then accepted the program's offer to take me on an excellently done five-minute tour of Voice Xpress's features and then on an interactive training session that quickly gets one up to speed on using commands to format text (bold this, italicize that, delete the previous paragraph, move the first paragraph to the end of the document, etc.).

Even without training to my voice, Voice Xpress was very good at recognizing formatting commands, though plain text dictation was messy, as one would expect before specific voice training (or what L&H call "enrollment"). One of the neat things about Voice Xpress versus the version (1.0) of Dragon NaturallySpeaking I reviewed a few months ago is that Voice Xpress lets multiple users use it (one at a time, of course; you can't just switch speakers in mid-session). Each user has to go through the "enrollment" (training) process, and a separate profile of speech patterns is stored on disk for each named user. So after Mary has finished using it and shuts down the program, John can fire it up, select his own profile, and dictate to his heart's content. (The latest versions of NaturallySpeaking and ViaVoice also have multi-user capability.)
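
For readers who like to see such things spelled out, the one-profile-per-named-user idea can be sketched in a few lines of Python. This is only an illustration: the `ProfileStore` class, the JSON file format, and the contents of a "profile" are all assumptions made up for the example, and say nothing about how Voice Xpress actually stores its enrollment data.

```python
import json
from pathlib import Path


class ProfileStore:
    """Hypothetical store: one speech-profile file per named user."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, user):
        # Each user gets a separate file, named after the user
        return self.directory / f"{user}.json"

    def enroll(self, user, speech_patterns):
        """Save the (placeholder) result of one user's enrollment."""
        self._path(user).write_text(json.dumps(speech_patterns))

    def load(self, user):
        """Load a user's profile; the user must have enrolled first."""
        path = self._path(user)
        if not path.exists():
            raise KeyError(f"{user} has not enrolled yet")
        return json.loads(path.read_text())

    def users(self):
        """List the names of all enrolled users, alphabetically."""
        return sorted(p.stem for p in self.directory.glob("*.json"))
```

In this toy version, Mary and John would each call `enroll` once, and whoever sits down next calls `load` with his or her own name before dictating.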

The Voice Xpress enrollment process is very similar to that of NaturallySpeaking. It's easy to use, and takes about an hour. Whereas Dragon gives you the choice of a couple of long passages from books, L&H supply a whole bunch of witticisms, proverbs, and aphorisms that are actually quite fun to read.

So far, so good. The next step was to figure out how to make Voice Xpress work with Microsoft Word, as advertised. I tried loading Word after loading XpressPad, but Voice Xpress would still not work with Word, only with XpressPad. I called the L&H Help Desk. It took several longish (toll free) phone calls, with only one longish (maybe two minutes) wait, to get to the root of the problem.

The result of having bits of the old Microsoft Office 95 lying around on my hard disk was that Voice Xpress apparently got confused about where to copy the files needed to make it work together with Microsoft Word 97 (part of Office 97), and it simply did not copy them, nor did it give me any error message.

Before discovering this problem and its fix, however, we went through all sorts of gyrations. First, it was thought the problem might have been caused by my installing the program to the D partition on my hard drive, while the MS Word files were on the C partition. I had put Voice Xpress on D because it needs about 130 megs of disk space and my C partition only had about 50 megs free. Even clearing out old, unused files left only 90 megs free; still not enough.

L&H suggested moving Word over to D, so it would share the same partition with Voice Xpress. Tried that, and uninstalled/re-installed Voice Xpress from scratch—which meant I lost the file containing my voice patterns, painstakingly built through an hour of enrollment, and would have to go through it again.

But it still didn't work, and it was only when Paul and Jason, the two L&H support technicians I talked with, asked me to check for certain files that we discovered them missing from the crucial directory. Suddenly the problem became very simple, and the fix very fast: copy three small files from one directory to another. Bingo. Everything worked great.
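
The check-and-copy fix described above is simple enough to sketch in Python. Treat this as a minimal illustration under stated assumptions: the function name `copy_missing_files` is hypothetical, and the directory and file names in the usage note are placeholders, since this column does not name the actual files or directories involved.

```python
import shutil
from pathlib import Path


def copy_missing_files(source_dir, target_dir, required_names):
    """Copy each required file that is absent from target_dir.

    Returns the list of file names that were copied.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    copied = []
    for name in required_names:
        destination = target / name
        if destination.exists():
            continue  # already in place, nothing to do
        origin = source / name
        if not origin.exists():
            # Complain loudly rather than skip the file without a word
            raise FileNotFoundError(f"{name} not found in {source}")
        shutil.copy2(origin, destination)
        copied.append(name)
    return copied
```

A call might look like `copy_missing_files("D:/source", "C:/target", ["one.ext", "two.ext", "three.ext"])`, with every path and name there invented for the example. Raising an error when a file cannot be found is a deliberate choice here, since the silent failure was what made the original problem so hard to spot.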

L&H's director of software development is onto the case, and you can be sure that: 1. The product really does work as advertised; 2. The installation problem I had was relatively obscure and is unlikely to affect most people; 3. L&H will fix the installation routine so people who've upgraded their versions of MS Word or MS Office won't encounter a problem at all in future releases; and 4. Until that fix is out, L&H technical support will know exactly what to tell you to do if you happen to call with this particular problem, and it'll only take two minutes to fix, not two days.

The only problem I have now is that the microphone recording volume seems overly sensitive, even though I have calibrated the volume almost to zero. The mic picks up all sorts of extraneous noises, like my dog padding along the wood and tile floor above me. (I work in a basement home office, and my pitbull has long nails she won't let us clip; you don't argue with a pitbull. She sounds a bit like a shod horse walking on cobblestones.) The mic hears this clip-clopping-along and Voice Xpress tries to interpret it, and while it is interpreting I cannot interrupt it, not even to turn the microphone off. So sometimes, when the dog is restless (i.e., it's suppertime), I seem to get stuck in the program, and have to invoke CTRL-ALT-DEL to shut it down, in which case I lose any training that has taken place since I started the session.

Like all ASR programs, Voice Xpress has to be finicky about microphone and sound card. When I reviewed Dragon NaturallySpeaking, I had to go and buy a Sound Blaster audio card because the el-cheapo card that came bundled with my old Compaq Presario gave lousy results. The Crystal sound card that came bundled with my IBM 300XL worked OK with NaturallySpeaking, but not quite so well with Voice Xpress. I have calibrated the mic several times, and each time I get a "better than average" rating from the Voice Xpress utility program, so it's hard to figure out what to do.

It's neither surprising nor unreasonable that ASR is finicky about audio recording capability. ASR is a very demanding task, and ASR programs need sharp ears: sharp in the sense of being able to distinguish between sounds intended for them and sounds that are purely extraneous. The microphone and sound card constitute the program's ears. One can only hope that the major computer manufacturers (and they certainly include Compaq and IBM, whose machines I own) will soon get the message that programs like Voice Xpress require and create a demand for good audio recording circuits, not just good playback circuits. All sound cards today have pretty good playback circuitry. Playing back sounds and music has been all the typical PC user wanted out of their sound system, and few people have used their PC's recording capabilities. That's about to change, and equipment manufacturers need to get the message.

I'll be talking more with the good folks at L&H about my sound problem, and I'll doubtless be doing more tweaking, so I hope to report progress next week, along with my detailed report of the results of using the program: its accuracy, speed of learning, and so on. Perhaps we'll have to buy slippers for the dog.

In the meantime, I hope I have whetted your appetite with my first impressions, after only a few hours working with Voice Xpress. The program is as good as Dragon NaturallySpeaking was when I first ran it, in terms of recognizing my speech, and may actually be a little better and a little faster to learn and adapt to my speech pattern.

Where it shines right out of the box is in formatting and editing text. An AI sub-program is trained to recognize a variety of natural language commands that let you control Word (and XpressPad), and it works remarkably well. Editing is inevitable with all ASR programs, because none can achieve 100 percent accuracy in recognizing what you say, so the easier it is to edit the better. I found it a good deal easier and more natural to move around my documents and make edits with Voice Xpress than with NaturallySpeaking. Not only is Voice Xpress more tolerant of different ways I might say the same thing, but it can handle all the commands available as menu options in Word. So instead of having to go through several mouse clicks to create a table in a document, I simply say "Insert a 3 by 4 table here," and . . .

[An empty 3 by 4 table appears here.]

Voilà!

Don't miss next week's gripping conclusion.
 
 

Until next week,

NEXT WEEK: Part 2 of the above review.
