Abstract:
A new criterion for measuring the quality of foreign-accented speech is proposed. The criterion uses both absolute and relative quality scores produced by an HMM speech recognizer. The absolute scoring function is determined by the difference between the likelihood distributions of the correct phone HMM and an incorrect phone HMM for a speech segment. The relative scoring function is designed to discriminate the distribution of delta-self values from the distribution of delta-sub values. A delta-self value is the difference in likelihood score between the correct model and an incorrect model for a speech segment. A delta-sub value is the difference in likelihood score between two incorrect phone models for the same speech segment. The approach is motivated by the following observation. The absolute score is sometimes low even when the speech segment sounds correct, yet in such cases the relative score remains high. This may happen because the HMM parameters were not chosen for evaluating voice quality. The best speech quality corresponds to the condition that both the absolute and the relative scores are high. A practical integrated measure is a linear combination of the absolute and relative scoring functions.
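The delta-self and delta-sub values and the integrated measure described above can be illustrated with a minimal sketch. It assumes that per-phone log-likelihood scores for a segment are already available from the recognizer as a dictionary. The function names, the phone labels, the example scores, and the weights `w_abs` and `w_rel` are hypothetical and are not taken from the paper, and the mapping from delta values to absolute and relative scores is not specified in the abstract, so it is left out.

```python
from typing import Mapping


def delta_self(loglik: Mapping[str, float], correct: str, incorrect: str) -> float:
    """Score difference between the correct model and one incorrect model."""
    return loglik[correct] - loglik[incorrect]


def delta_sub(loglik: Mapping[str, float], incorrect_a: str, incorrect_b: str) -> float:
    """Score difference between two incorrect models for the same segment."""
    return loglik[incorrect_a] - loglik[incorrect_b]


def integrated_score(absolute: float, relative: float,
                     w_abs: float = 0.5, w_rel: float = 0.5) -> float:
    """Linear combination of the absolute and relative scores.

    The weights are assumed values chosen for illustration only.
    """
    return w_abs * absolute + w_rel * relative


# Hypothetical log-likelihoods of one segment under three phone HMMs,
# where "ae" is taken to be the correct phone.
segment_loglik = {"ae": -42.0, "eh": -45.5, "ih": -50.0}

d_self = delta_self(segment_loglik, "ae", "eh")  # 3.5
d_sub = delta_sub(segment_loglik, "eh", "ih")    # 4.5
```

In this sketch a segment would be rated highly only when both inputs to `integrated_score` are high, which mirrors the condition stated in the abstract.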