Abstract:
An important subjective measure of speech quality is the syllable articulation score (AS). In calculating AS, all hearing errors are counted with equal weight. Confusion usually occurs between phonemes that share common characteristics, but confusion between quite different phonemes is observed when a synthetic voice or an exotic voice is assessed. This paper proposes a three-dimensional articulatory model of consonants, in which the position of each phoneme is assigned according to its place and manner of articulation. The weighted syllable articulation score (WAS) is estimated from the distance in this model between the phoneme uttered and the phoneme reported by the listener. The WAS of speech processed with a high-pass filter (HPF) is 7% lower than that of speech processed with a low-pass filter (LPF), although their ASs are almost the same. Helium speech is quite unnatural and less intelligible than ordinary speech. It is shown that the WAS of helium speech is lower than the WAS of speech processed with LPF or HPF, although their ASs are almost the same. Assessments by WAS of synthetic voice and various kinds of distorted speech will be presented. The final goal of this study is to clarify the relation between WAS and other subjective measures such as the mean opinion score (MOS).
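The idea of weighting errors by distance can be illustrated with a minimal sketch. The abstract does not give the consonant coordinates or the weighting formula, so everything below is an assumption made for illustration: the coordinate table `COORDS`, the choice of voicing as the third axis, the Euclidean distance, and the normalization by the largest distance in the table are all hypothetical and are not the authors' method.

```python
import math

# Hypothetical 3-D positions (place, manner, voicing). Illustrative values
# only; the paper's actual model and axes are not given in the abstract.
COORDS = {
    "p": (0.0, 0.0, 0.0),
    "b": (0.0, 0.0, 1.0),
    "t": (1.0, 0.0, 0.0),
    "d": (1.0, 0.0, 1.0),
    "k": (2.0, 0.0, 0.0),
    "g": (2.0, 0.0, 1.0),
    "s": (1.0, 1.0, 0.0),
    "z": (1.0, 1.0, 1.0),
    "m": (0.0, 2.0, 1.0),
    "n": (1.0, 2.0, 1.0),
}


def distance(uttered, responded):
    """Euclidean distance between two consonants in the assumed model."""
    return math.dist(COORDS[uttered], COORDS[responded])


# Largest distance between any two consonants, used to scale penalties to [0, 1].
D_MAX = max(distance(a, b) for a in COORDS for b in COORDS)


def articulation_score(pairs):
    """Unweighted AS: percentage of responses that match the uttered consonant."""
    correct = sum(1 for uttered, responded in pairs if uttered == responded)
    return 100.0 * correct / len(pairs)


def weighted_articulation_score(pairs):
    """Assumed WAS: each error is penalized in proportion to its distance."""
    penalty = sum(distance(u, r) / D_MAX for u, r in pairs)
    return 100.0 * (1.0 - penalty / len(pairs))


# Two response sets with the same number of errors but different error types.
near_errors = [("p", "p"), ("t", "t"), ("p", "b"), ("s", "z")]
far_errors = [("p", "p"), ("t", "t"), ("p", "n"), ("k", "m")]

for name, pairs in (("near", near_errors), ("far", far_errors)):
    print(name, articulation_score(pairs), round(weighted_articulation_score(pairs), 1))
```

In this sketch both response sets have the same AS, but the set whose confusions are between distant consonants receives the lower weighted score, which is the kind of distinction the abstract reports between speech with similar ASs.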