Abstract:
The adequacy with which source models describe pathological phonation has been little studied, and the perceptual importance of deviations from a perfect model fit remains unknown. To address these issues, pathological voices were selected randomly from a library of samples. Glottal source waveforms were estimated by inverse filtering airflow signals. Formant frequencies and bandwidths were estimated using combined LPC and spectrographic analyses. Source functions were synthesized by least-squares fitting a simplified LF model [Y. Qi and N. Bi, J. Acoust. Soc. Am. 96, 1182–1185 (1994)] to the glottal flow derivative. Three versions of each voice were created. The first was constructed by extracting one cycle from the original voice and repeating it to form a 1-s signal. The second was constructed by recombining the inverse-filtered source with the estimated vocal tract resonances. The third combined the LF-modeled source with the same estimated vocal tract resonances. Expert listeners compared the three versions of each voice, and results were interpreted in terms of the specific differences among stimuli. This procedure allowed evaluation of the extent and perceptual importance of the information lost first by inverse filtering and then by fitting the LF model to the resulting volume velocity signal. [Research supported by NIH.]
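
The construction of the first stimulus version (one cycle repeated to form a 1-s signal) can be illustrated with a minimal sketch. This is not the authors' procedure: the function name `repeat_cycle`, the 10-kHz sampling rate, and the sinusoidal stand-in for an extracted cycle are all assumptions made for illustration, and the sketch ignores how the cycle boundaries were chosen in the actual study.

```python
import numpy as np

def repeat_cycle(cycle, fs, duration_s=1.0):
    """Tile one cycle end to end and trim the result to duration_s seconds.

    cycle      : 1-D array holding a single extracted cycle (hypothetical input)
    fs         : sampling rate in Hz (assumed; not stated in the abstract)
    duration_s : target duration in seconds (1 s, as in the abstract)
    """
    cycle = np.asarray(cycle, dtype=float)
    if cycle.size == 0:
        raise ValueError("cycle must contain at least one sample")
    n_target = int(round(fs * duration_s))
    # Ceiling division: enough whole repetitions to cover the target length.
    n_repeats = -(-n_target // cycle.size)
    return np.tile(cycle, n_repeats)[:n_target]

# Hypothetical stand-in for a real cycle: 100 samples of a sine at fs = 10 kHz,
# which corresponds to a 100-Hz fundamental.
fs = 10_000
cycle = np.sin(2 * np.pi * np.arange(100) / 100)
stimulus = repeat_cycle(cycle, fs)
```

Because every period of such a signal is identical, it contains no cycle-to-cycle variation in period or amplitude, which is one way it can differ from the voice it was taken from.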