Abstract:
A new pitch extraction method named ACLOS (AutoCorrelation of LOg Spectrum) [Kunieda et al., ICASSP-96, pp. 232--235 (1996)] has been proposed. In this method, the fundamental frequency is estimated from the maximum peak of the autocorrelation function of the log spectrum. ACLOS gives more robust and plausible pitch information than other methods, but its pitch extraction performance has not yet been evaluated quantitatively. This paper compares the pitch extraction performance of ACLOS, the autocorrelation method, and the cepstrum method, using the assessment method employed by Rabiner [Rabiner et al., IEEE Trans. Acoust. Speech Signal Process. ASSP-24, 399--418 (1976)]. To increase robustness for noisy speech, ACLOS is also modified: the log spectrum is clipped before the autocorrelation function is calculated, because noise strongly affects the valleys of the log spectrum at low SNR. Experimental results show that ACLOS reduces the gross pitch error even when the speech signal is degraded by noise. These results indicate that ACLOS can be a powerful tool for pitch extraction from noisy speech.
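The procedure described in the abstract (log spectrum, clipping, autocorrelation along the frequency axis, peak picking) can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `aclos_f0`, the Hann window, the FFT size, the search range `fmin`/`fmax`, and the rule of clipping at `clip_db` below the spectral maximum are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np

def aclos_f0(frame, fs, nfft=4096, fmin=50.0, fmax=500.0, clip_db=50.0):
    """Estimate F0 (Hz) from the autocorrelation of the log spectrum.

    Illustrative sketch only: window, nfft, search range, and the
    clipping rule are assumptions, not values taken from the paper.
    """
    # Log magnitude spectrum of the windowed, zero-padded frame (in dB).
    windowed = frame * np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed, nfft))
    log_spec = 20.0 * np.log10(spectrum + 1e-12)

    # Assumed clipping rule: raise every valley to a floor that lies
    # clip_db below the spectral maximum, so deep valleys carry no weight.
    floor = log_spec.max() - clip_db
    log_spec = np.maximum(log_spec, floor)

    # Remove the mean, then autocorrelate along the frequency axis.
    log_spec = log_spec - log_spec.mean()
    acf = np.correlate(log_spec, log_spec, mode="full")[len(log_spec) - 1:]

    # Harmonics are spaced F0 apart, so the strongest peak at a nonzero
    # lag (in frequency bins) marks the harmonic spacing.
    lag_min = max(1, int(np.floor(fmin * nfft / fs)))
    lag_max = min(len(acf) - 1, int(np.ceil(fmax * nfft / fs)))
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return best_lag * fs / nfft

# Synthetic test frame: 19 harmonics of 200 Hz with 1/k amplitudes.
fs = 8000
n = np.arange(512)
frame = sum(np.sin(2 * np.pi * k * 200.0 * n / fs) / k for k in range(1, 20))
f0_est = aclos_f0(frame, fs)
```

Because the lag is measured in frequency bins, the resolution of this sketch is `fs / nfft` (about 2 Hz here), so `f0_est` can only match the 200 Hz of the synthetic frame to within that spacing. Setting `clip_db` to a very large value effectively disables the clipping step, which gives a simple way to compare the two variants on noisy input.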