4aSC1. Spectral representation of speech using mel-generalized cepstral coefficients.

Session: Thursday Morning, December 5

Time: 11:35


Author: Kazuhito Koishida
Location: Precision and Intelligence Lab., Tokyo Inst. of Technol., 4259, Nagatsuta, Midori-ku, Yokohama, 226 Japan
Author: Keiichi Tokuda
Location: Nagoya Inst. of Technol., Gokiso-cho, Showa-ku, Nagoya, 466 Japan
Author: Takao Kobayashi
Location: Tokyo Inst. of Technol., Yokohama, 226 Japan
Author: Satoshi Imai
Location: Tokyo Inst. of Technol., Yokohama, 226 Japan

Abstract:

In mel-generalized cepstral analysis, the model spectrum can be varied continuously from AR to cepstral modeling by changing the value of a parameter (gamma), so that an appropriate model spectrum can be chosen. Furthermore, the spectrum represented by mel-generalized cepstral coefficients has a frequency resolution similar to that of the human ear. Although mel-generalized cepstral coefficients are expected to be a useful parameter set for speech coding and speech synthesis applications, the filter may become unstable after the coefficients are quantized. This paper presents a spectral representation of speech using mel-generalized cepstral coefficients. With the proposed spectral parameters, filter stability can easily be ensured after quantization. First, the statistical distribution of the proposed parameters is shown. Second, the computational complexity of calculating the proposed parameters from mel-generalized cepstral coefficients is investigated using a speech database. Finally, the quantization and interpolation properties of the proposed parameters are compared with those of line spectrum pairs (LSP). Experimental results show that the proposed parameters perform better than LSP.
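
The role of gamma can be illustrated with a minimal sketch. It assumes the usual formulation of mel-generalized cepstral analysis, in which the model spectrum is obtained by applying the inverse generalized logarithm, (1 + gamma*w)^(1/gamma) for gamma != 0 and exp(w) for gamma = 0, to a polynomial in a first-order all-pass function with warping parameter alpha. The function names and the values of alpha and w below are illustrative assumptions and are not taken from the paper; the proposed spectral parameters themselves are not reproduced here.

```python
import cmath
import math


def inv_generalized_log(w: float, gamma: float) -> float:
    """Inverse generalized logarithm (hypothetical helper name).

    Returns (1 + gamma*w)**(1/gamma) for gamma != 0 and exp(w) for gamma == 0.
    gamma = -1 gives 1/(1 - w), the all-pole (AR) form; gamma = 0 gives the
    exponential (cepstral) form.
    """
    if gamma == 0.0:
        return math.exp(w)
    base = 1.0 + gamma * w
    if base <= 0.0:
        raise ValueError("1 + gamma*w must be positive for a real result")
    return base ** (1.0 / gamma)


def warped_delay(z_inv: complex, alpha: float) -> complex:
    """First-order all-pass function (z^-1 - alpha) / (1 - alpha*z^-1).

    It replaces the unit delay so that the frequency axis is warped;
    alpha is the warping parameter, with |alpha| < 1.
    """
    return (z_inv - alpha) / (1.0 - alpha * z_inv)


if __name__ == "__main__":
    w = 0.5  # arbitrary illustrative value
    for gamma in (-1.0, -0.5, -1e-6, 0.0):
        print(f"gamma={gamma:>8}: {inv_generalized_log(w, gamma):.6f}")

    # On the unit circle the all-pass function has unit magnitude, so it
    # changes only the phase, i.e. it remaps the frequency axis.
    z_inv = cmath.exp(-1j * 0.7)
    print(abs(warped_delay(z_inv, alpha=0.35)))  # alpha chosen for illustration
```

As gamma moves from -1 toward 0, the printed values move smoothly from the all-pole form toward the exponential form, which is the continuous variation between AR and cepstral modeling that the abstract describes.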


ASA 132nd meeting - Hawaii, December 1996