Voice computing has arrived

A couple of weeks ago I wrote that I had bought a compact digital voice recorder for recording interviews and speeches.

I have found this very useful for many of the things that I do and, even better, the Olympus WS-110 DS came with a copy of Dragon NaturallySpeaking 9.5 speech recognition software which transcribes dictation into text files.

Eleven years ago, almost to the day, I was playing with speech recognition programs which were the grandfathers of this current version. Looking back on my articles written at the time, I thought the programs were pretty good stuff and I was convinced I’d keep using them.

I did so for a couple of weeks but found that it was just too labour-intensive to correct all the errors. I was hopeful that things would have improved since then and so I installed the package on my computer.

My version of NaturallySpeaking is known as the Recorder Version and needs you to record quite a sizeable chunk of text, chosen from a number of passages which are provided, before you can use it. Accuracy also improves as you use it and make corrections to the text it has transcribed.

Once I’d got the training out of the way, I eagerly plunged ahead and began dictating that week’s column. I was extremely impressed with the high quality of the transcribed text even though it takes a bit of time to load and transcribe.

One drawback to the recorder edition is that you first have to make the recording and then load it into the program, before it can be transcribed. For this reason, I have been considering buying the full package which allows you to dictate directly into whatever program you happen to be working with.

Compatible programs for the full version include MS-Word, Firefox and Internet Explorer. The program transcribes your speech, allowing you to make corrections as you go. You can also use voice commands for opening and saving files and navigating about on the Internet.

You can get NaturallySpeaking Preferred 10 for about R2800 and the Standard 10 version for about R1700. The Standard edition seems to do most things but it apparently won’t import files produced on a voice recorder, as the Preferred edition will.

Being in the media business, I was very interested to see if NaturallySpeaking could transcribe an interview but, unfortunately, it turns out that it can’t. It needs a good quality recording by the single voice that it has been trained to recognise, before it can operate.

NaturallySpeaking is very accurate and the major limitation to using it, in my opinion, is not how good the software is, but how well the user can learn to dictate. I’ve now been using the package for four weeks, and I’m still having a bit of difficulty in dictating accurately because you not only have to think of what you want to say next, but also remember to speak clearly and insert the punctuation.

Things are starting to come right and, although I still get pretty tangled up, NaturallySpeaking doesn’t mind a silence while I’m thinking of what to say next. I often talk myself into a dead end and I have found that it’s easier and quicker just to say ‘new paragraph’ and start the paragraph again; the abandoned one can easily be deleted later.

Running words together is another source of inaccuracy because the program, no matter how clever, won’t know what you mean unless you speak each word distinctly. You can pretty much talk as fast as you like but each word needs to be distinct, which takes a bit of practice but does start coming right in time.

There are special versions of the program designed for the legal and medical professions and I guess that they would be very useful for those people. Details on those and the other versions are on www.nuance.com. A comparison of the features in each version is available here.

I have now been using NaturallySpeaking for the last four weeks for producing my weekly column, and other things, and I have to say that I’m really hooked on this method of working. I don’t think that I will be abandoning it any time soon, as I did 11 years ago.

