ASA 127th Meeting M.I.T. 1994 June 6-10

3pSP18. Skill acquisition, coarticulation, and rate effects in a neural network model of speech production.

Frank H. Guenther

Dept. of Cognitive and Neural Systems, Boston Univ., 111 Cummington St., Rm. 244, Boston, MA 02215

This work describes a neural network model of speech motor skill acquisition and speech production that explains a wide range of data on contextual variability, motor equivalence, coarticulation, and speaking rate effects. Model parameters are learned during a babbling phase. To explain how infants learn phoneme-specific and language-specific limits on acceptable articulatory variability, the learned speech sound targets take the form of regions, or convex hulls, in orosensory coordinates. This leads to an explanation of coarticulation wherein the target for a speech sound is reduced in size based on context to provide a more efficient sequence of articulator movements. Furthermore, reduction of target size for better accuracy during slower speech (in accordance with Fitts' law) leads to differential effects for vowels and consonants, as seen in speaking rate experiments that were previously explained by positing separate control processes for the two sound classes. The babbling process also naturally accounts for the formation of coordinative structures, or groups of articulator movements marshalled together to perform orosensory tasks. Coordinative structures provide motor equivalence, including automatic compensation for perturbations of, or constraints on, the articulators. Computer simulations verify the model's motor equivalence, coarticulation, and speaking rate properties. [Work partially supported by AFOSR F49620-92-J-0499.]
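
The two ideas the abstract relies on, region-shaped targets and the speed-accuracy tradeoff of Fitts' law, can be illustrated with a minimal sketch. It is not the model's implementation. It assumes an axis-aligned box as a simple stand-in for a convex target region, assumes that movement is directed to the nearest point of that region, and uses the classic form of Fitts' law, MT = a + b * log2(2D / W). The function names, the constants a and b, and the coordinate values are all hypothetical.

```python
import math


def fitts_movement_time(distance, width, a=0.0, b=0.1):
    """Classic Fitts' law: MT = a + b * log2(2D / W).

    `a` and `b` are illustrative constants, not values from the model.
    """
    return a + b * math.log2(2.0 * distance / width)


def nearest_point_in_region(position, lower, upper):
    """Closest point to `position` inside an axis-aligned box target.

    The box is an assumed simplification of a convex target region.
    """
    return [min(max(p, lo), hi) for p, lo, hi in zip(position, lower, upper)]


# Hypothetical current articulator state in two orosensory dimensions.
position = [0.2, 0.8]

# A wide target already contains the current state, so no movement is needed.
wide = nearest_point_in_region(position, [0.0, 0.7], [1.0, 0.9])

# A smaller target forces movement along the first dimension only.
narrow = nearest_point_in_region(position, [0.4, 0.75], [0.6, 0.85])

# For a fixed distance, halving the target width raises the movement time.
slow = fitts_movement_time(distance=8.0, width=1.0)
fast = fitts_movement_time(distance=8.0, width=2.0)
```

In this sketch, a dimension whose target range already contains the current state contributes no movement, which is one way to picture how a context-dependent reduction in target size changes the sequence of articulator movements.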