4aSC7. Training section tolerant HMM in concatenated training.

Session: Thursday Morning, December 5


Author: Hiroshi Matsuo
Location: Mining College, Akita Univ., 1-1 Tegata Gakuen-machi, Akita-shi, 010 Japan
Author: Masaaki Ishigame
Location: Mining College, Akita Univ., 1-1 Tegata Gakuen-machi, Akita-shi, 010 Japan


Concatenated training of phoneme HMMs can use speech data without hand labels, but it tends to lower the recognition rate because of improper training sections. Speaker-independent isolated word recognition experiments showed that the non left-right HMM is more tolerant of the quality of the training section in concatenated training than the conventional left-to-right HMM. The non left-right HMM has a structure in which state transitions within a phoneme are ergodic and state transitions between two successive phonemes are left-to-right. The non left-right HMM and the conventional left-to-right HMM show much the same performance as long as the training section is given properly by hand labels. When the training section contains undesirable data, the recognition rate decreases for both HMMs, but the decrease is smaller for the non left-right HMM than for the conventional left-to-right HMM. A similar tendency was observed when the training section was determined by a phoneme duration model independently of the hand labels.
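
The two topologies compared above can be illustrated with a mask of allowed state transitions. The following is only a sketch under stated assumptions, not the authors' implementation: the function name `transition_mask`, the number of states per phoneme, and the rule that any state of one phoneme may pass to any state of the next phoneme are hypothetical choices, since the abstract does not specify them.

```python
def transition_mask(num_phonemes, states_per_phoneme, ergodic_within=True):
    """Return a square 0/1 matrix; entry [i][j] is 1 if transition i -> j is allowed.

    States are numbered phoneme by phoneme. All names and sizes here are
    illustrative assumptions; they do not come from the abstract.
    """
    n = num_phonemes * states_per_phoneme
    mask = [[0] * n for _ in range(n)]
    for i in range(n):
        p_i, s_i = divmod(i, states_per_phoneme)  # (phoneme index, state within phoneme)
        for j in range(n):
            p_j, s_j = divmod(j, states_per_phoneme)
            if p_i == p_j:
                # Within a phoneme: ergodic (every transition allowed), or
                # conventional left-to-right (self-loop or next state only).
                allowed = ergodic_within or s_j in (s_i, s_i + 1)
            elif p_j == p_i + 1:
                # Between successive phonemes: forward only in both topologies.
                # Assumption: the ergodic variant may leave from any state and
                # enter any state; the conventional one goes last -> first.
                allowed = ergodic_within or (
                    s_i == states_per_phoneme - 1 and s_j == 0
                )
            else:
                # No skipping over phonemes and no return to an earlier phoneme.
                allowed = False
            mask[i][j] = 1 if allowed else 0
    return mask


# Two concatenated phonemes with three states each (sizes chosen arbitrarily).
non_left_right = transition_mask(2, 3, ergodic_within=True)
left_to_right = transition_mask(2, 3, ergodic_within=False)
```

In this sketch, the non left-right mask permits backward moves inside a phoneme (for example state 2 to state 0) but never from the second phoneme back to the first, whereas the conventional mask permits only self-loops and single forward steps.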

ASA 132nd meeting - Hawaii, December 1996