Update on Continuous Speech

More information is available in the article An Overview of Speech Recognition, Winter, 2002, posted in the Resource Lab (http://www.edc.org/spk2wrt/lab.html).

The current situation with speech recognition. Generally, when people today speak of "speech recognition" they are referring to continuous speech recognition, so named because of its ability to interpret continuous speech - in other words, speech as we use it for talking. Continuous speech recognition first appeared commercially in 1997 and has been characterized by frequent changes in software versions. The primary products now available (Winter, 2002) are various editions of Dragon NaturallySpeaking (now owned by ScanSoft) and IBM ViaVoice. Dragon products operate in Windows only, while ViaVoice offers versions for both Windows and the Macintosh. There is also a new (2001) Macintosh-only product, iListen, by MacSpeech. (The older speech recognition technology, discrete speech recognition, represented by DragonDictate, is less readily available now but is still preferable for some individuals - see below.)

Product information can be obtained at the following sites:
The promise and the reality. Continuous speech recognition is very appealing because, according to its marketing claims, one can speak to the computer in a natural voice and at a normal rate of speech. This is a fine example of marketing hyperbole that glosses over important aspects of how one learns to use the software. Failure to understand the training and usage requirements of speech recognition products often leads to unsuccessful experiences with this technology. Additional unsuccessful experiences are not what you want for your students with writing disabilities. On the other hand, with the proper introduction to and training on the technology, and with reasonable expectations, speech recognition software can operate with remarkable ease and accuracy, and can be a tremendous boon to some students who might otherwise never have a successful writing experience. I will discuss the introduction and training needs below.

Our recommendations. The various continuous speech products differ in how they operate, and these differences matter for individual users, especially children and students with disabilities, so it helps to know them. A few pointers are listed below.

All other things being equal, Dragon NaturallySpeaking Preferred is our recommended system for students with LD in schools. The Preferred version is recommended over less expensive versions of NaturallySpeaking because it includes two features that can be of particular benefit to students with learning disabilities: digital playback of the user's voice (what the user actually said) and synthesized speech readback of the text (what the software actually put on the page). Students with reading difficulties can use these features to check for errors.
All versions of NaturallySpeaking also include other features that can be important for students with LD: dynamic updating of the words in the correction window, easy management of the correction window, and an arrow that signals location (like a bouncing ball) during digital readback. NaturallySpeaking versions 4 and above include voice models for adolescents (from a discontinued product, NaturallySpeaking for Teens).

According to most reviews (and the author's personal experience), IBM ViaVoice is equal to the Dragon products in accuracy for most adult and adolescent users. In our experience, it has even been slightly more accurate than Dragon products during initial usage with several adolescent users, but with proper training these differences became insignificant. ViaVoice also includes synthesized speech readback of the text, so the user can hear what the software actually put on the page, but has more limited digital readback of the speaker's voice. On the whole, we believe that the interface is somewhat less effective than that of the Dragon products for some students with LD, as management of the correction window and cursor location seem more intrusive. These characteristics might not be a problem for many users. Certainly, if students need a Macintosh product, ViaVoice is a fine alternative.

MacSpeech iListen is a Macintosh-only alternative. The author has only used an early release version that seemed promising, but was not a complete product. Several members of the Speak to Write listserv (www.edc.org/spk2wrt/hypermail) report that the product performed well with middle-school and older students, and compared it favorably to ViaVoice for the Macintosh. It is not clear whether the final version has text-to-speech readback.

Introduction and training. Learning to use speech recognition effectively requires four things.
The most common complaint encountered in speech recognition implementation is that the software "just doesn't work" when the student talks to it. Digging below the surface of such complaints often reveals that the student and adult supporter have not really understood some of the basic requirements for training the software and have expectations of ease that exceed the capacities of the technology.

Users must remember that the computer does not have any comprehension of language but is operating behind the scenes through a series of mathematical algorithms. In the initial use of the software the user is developing a "voice file." This voice file is the software's matching of the individual user's voice to the algorithms that it already has. It is essential that the user learn both how to speak to the software to maximize its understanding (#2 above), and also how to make corrections the proper way to provide the software voice file with useful information about its recognition errors (#3). Proper training and patience are often the solution.

For successful implementation of speech recognition, students, staff, and schools have specific responsibilities. Schools must provide:
This article was written in February 2002 by Dr. Follansbee.
This Web site was funded from 1997-2001 by the U.S. Department of Education, National Institute on Disability and Rehabilitation Research (NIDRR). Contract #HI33G70143. The views expressed within this site do not necessarily reflect the views of the Government. Site hosted by Education Development Center, Inc. ©2000 Education Development Center, Inc. All Rights Reserved.