Abstract:
A modified LVQ2 algorithm (MLVQ2) for phoneme recognition was previously proposed. The phoneme reference patterns produced by that algorithm had a fixed length. It achieved high phoneme recognition performance for isolated spoken words, but not for continuous speech, because fixed-length reference patterns cannot cope with large variations in phoneme duration. This paper proposes a new algorithm for constructing reference patterns by discriminative training based on MLVQ2 and DP matching. The initial multiple reference patterns are optimized as follows. First, DP matching is used to compute the correspondence between each frame of the input sample and a frame of the nearest reference pattern in an incorrect phoneme class. Then, following MLVQ2, the corresponding frame vector of the reference pattern in the incorrect phoneme class is moved away from the frame vector of the input sample, while the corresponding frame vector of the reference pattern in the correct phoneme class is moved closer to it. In experiments using the ASJ 503 sentences speech database, the method achieved a phoneme recognition rate of 81.3%, which is 1.7% higher than that of the previous method.
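The optimization step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact MLVQ2 update rule, so a plain LVQ2-style update with a single learning rate `alpha` is assumed, and any conditions MLVQ2 places on when an update is applied are omitted. The function names `dp_align` and `update_step`, the Euclidean frame distance, and the symmetric DP step pattern are likewise assumptions made for illustration.

```python
import numpy as np

def dp_align(x, r):
    """DP matching between input frames x (T x D) and a reference pattern r (U x D).

    Returns the accumulated distance and the frame correspondence as a list
    of (input_frame, reference_frame) index pairs.
    """
    T, U = len(x), len(r)
    # Euclidean distance between every input frame and every reference frame.
    d = np.linalg.norm(x[:, None, :] - r[None, :, :], axis=2)
    g = np.full((T, U), np.inf)
    g[0, 0] = d[0, 0]
    for t in range(T):
        for u in range(U):
            if t == 0 and u == 0:
                continue
            best = min(
                g[t - 1, u] if t > 0 else np.inf,
                g[t, u - 1] if u > 0 else np.inf,
                g[t - 1, u - 1] if t > 0 and u > 0 else np.inf,
            )
            g[t, u] = d[t, u] + best
    # Backtrack from the last cell to recover the correspondence.
    t, u = T - 1, U - 1
    path = [(t, u)]
    while (t, u) != (0, 0):
        candidates = []
        if t > 0 and u > 0:
            candidates.append((g[t - 1, u - 1], t - 1, u - 1))
        if t > 0:
            candidates.append((g[t - 1, u], t - 1, u))
        if u > 0:
            candidates.append((g[t, u - 1], t, u - 1))
        _, t, u = min(candidates)
        path.append((t, u))
    path.reverse()
    return g[-1, -1], path

def update_step(x, refs, labels, true_label, alpha=0.05):
    """One discriminative update for a single training sample x.

    refs is a list of float arrays (one variable-length pattern each) that is
    modified in place; labels gives the phoneme class of each pattern.
    Assumed rule (not taken from the source): r += +/- alpha * (x_frame - r).
    """
    aligned = [dp_align(x, r) for r in refs]
    correct = [k for k, c in enumerate(labels) if c == true_label]
    wrong = [k for k, c in enumerate(labels) if c != true_label]
    kc = min(correct, key=lambda k: aligned[k][0])  # nearest correct pattern
    kw = min(wrong, key=lambda k: aligned[k][0])    # nearest incorrect pattern
    for k, sign in ((kc, 1.0), (kw, -1.0)):
        for t, u in aligned[k][1]:
            # Move toward (correct class) or away from (incorrect class)
            # the input frame matched to this reference frame.
            refs[k][u] += sign * alpha * (x[t] - refs[k][u])
    return kc, kw
```

Because the reference patterns are stored as arrays with different numbers of frames and compared only through the DP correspondence, their lengths need not be equal, which is the property the fixed-length MLVQ2 patterns lacked. Calling `update_step` once per training sample over several passes would give a simple training loop under these assumptions.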