Abstract:
In manual cued speech (MCS), a speaker produces hand gestures to resolve ambiguities among speech elements that are often confused by speechreaders. The shape of the hand distinguishes among consonants; the position of the hand relative to the face among vowels. Experienced receivers of MCS achieve nearly perfect reception of everyday connected speech. MCS has been taught to very young deaf children and greatly facilitates language learning, communication, and general education. A system that can produce cued speech automatically in real time is currently being developed at MIT. Cues are derived by a speaker-dependent HMM speech recognizer that uses context-dependent phone models and are presented visually by superimposing animated handshapes on the face of the talker. The benefits provided by these cues depend strongly on articulation of hand movements and on precise synchronization of the actions of the hands and the face. Cue receivers experienced in the reception of MCS can correctly recognize roughly two-thirds of the keywords in cued low-context sentences, compared to roughly one-third by speechreading alone. Ongoing development aims at increasing the accuracy and robustness of the cue recognizer, refining the display, and simplifying the hardware required for use in classroom settings. [Work supported by NIH.]