4pSC15. Speech recognition based on subword units.

Session: Thursday Afternoon, December 5


Author: Takuya Noizumi
Location: Dept. of Information Sci., Fukui Univ., 3-9-1 Bunkyo, Fukui, 910 Japan
Author: Mikio Mori
Location: Dept. of Information Sci., Fukui Univ., 3-9-1 Bunkyo, Fukui, 910 Japan
Author: Shuji Taniguchi
Location: Dept. of Information Sci., Fukui Univ., 3-9-1 Bunkyo, Fukui, 910 Japan

Abstract:

Large-vocabulary isolated word recognition requires an amount of training data proportional to the vocabulary size in order to characterize each individual word model. A subword-unit-based approach is a more viable alternative to the word-based approach for overcoming this training data problem, since different words can share common segments in their representations. This paper deals with two isolated word recognition systems that both employ the subword-unit-based approach but use completely different methods of segmentation. In one system, a hidden Markov model is used to decompose a word into subword units (segments), and the frequency spectra of those subword units are fed to a recurrent neural network to yield a subword code sequence for the word. This sequence is then decoded, ideally as the original word, by a set of hidden Markov models for isolated words. In the other system, subword boundaries within a word are detected by finding peaks of the delta cepstrum of the word, and the resulting sequence of subwords is decoded into the original word by means of concatenated hidden Markov models of isolated words. These systems attain average recognition accuracies in the range of 92% to 96%.
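
The delta-cepstrum segmentation used in the second system might be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the regression window, the peak criterion, or any threshold, so the function names `delta_cepstrum` and `subword_boundaries` and the parameters `k` and `rel_threshold` are all assumptions made for the example. The sketch computes a regression-based delta over neighbouring frames, takes its magnitude per frame, and marks local maxima above a relative threshold as candidate subword boundaries.

```python
import numpy as np

def delta_cepstrum(cep, k=2):
    """Regression-based delta over +/-k frames (k is an assumed window size).

    cep: array of shape (frames, coefficients) holding cepstral vectors.
    """
    n = cep.shape[0]
    # Repeat the edge frames so that every frame has k neighbours on each side.
    padded = np.pad(cep, ((k, k), (0, 0)), mode="edge")
    num = np.zeros_like(cep, dtype=float)
    for j in range(1, k + 1):
        num += j * (padded[k + j : k + j + n] - padded[k - j : k - j + n])
    return num / (2.0 * sum(j * j for j in range(1, k + 1)))

def subword_boundaries(cep, k=2, rel_threshold=0.5):
    """Return frame indices where the delta-cepstrum magnitude peaks.

    rel_threshold is a hypothetical tuning parameter: peaks below this
    fraction of the largest magnitude are ignored.
    """
    mag = np.linalg.norm(delta_cepstrum(cep, k), axis=1)
    if mag.max() == 0.0:
        return []  # no spectral change, hence no boundary candidates
    threshold = rel_threshold * mag.max()
    peaks = []
    for t in range(1, len(mag) - 1):
        # Local maximum; on a flat top only the first frame is reported.
        if mag[t] >= threshold and mag[t] > mag[t - 1] and mag[t] >= mag[t + 1]:
            peaks.append(t)
    return peaks
```

Given a cepstral matrix for one word, `subword_boundaries(cep)` returns candidate boundary frames; the segments between them would then be passed to the hidden Markov model stage described above. In practice the threshold and window would need tuning on real speech, since spectral transitions are far less abrupt than in a synthetic test signal.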


ASA 132nd meeting - Hawaii, December 1996