Abstract:
This paper describes subband-cross-correlation (SBXCOR) analysis for robust speech recognition under noisy conditions. SBXCOR analysis extends subband-autocorrelation (SBCOR) analysis, a signal processing technique that extracts periodicities present in speech signals. The performance of SBXCOR is investigated using a DTW word recognizer, both under acoustic conditions simulated on a computer and in a real environment. Under the simulated condition, it is assumed that the speech signals in the two channels are perfectly synchronized while the noises are uncorrelated. Consequently, the effective signal-to-noise ratio of the signal generated by simply summing the two channels is raised by about 3 dB. In this ideal case, SBXCOR is shown to be less robust than SBCOR extracted from the two-channel-summed signal, but more robust than the conventional one-channel SBCOR. The resulting performance was much better than that of the smoothed group delay spectrum and mel-frequency cepstral coefficients. In a real computer room, SBXCOR is shown to be more robust than the two-channel-summed SBCOR. The microphone setup for SBXCOR is also discussed.
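
The figure of about 3 dB follows from power arithmetic. Summing two synchronized copies of the speech doubles its amplitude, giving four times the power, while two uncorrelated noises of equal power add in power, giving twice the power. The ratio therefore improves by a factor of 2, or 10 log10(2), which is about 3 dB. The following Python sketch, which is not taken from the paper, checks this numerically. The sinusoid standing in for speech, the equal-power Gaussian noises, the sample count, and the function names are all illustrative assumptions.

```python
import math
import random


def snr_db(signal, noise):
    """Return the signal-to-noise ratio in dB from two sample sequences."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)


def two_channel_gain_db(num_samples=100_000, seed=0):
    """Estimate the SNR gain from summing two channels.

    Assumptions (illustrative, not from the paper): the speech component is
    identical in both channels, and the two noises are independent Gaussian
    sequences of equal power.
    """
    rng = random.Random(seed)
    # Stand-in for speech: a 200 Hz sinusoid sampled at 8 kHz (hypothetical values).
    speech = [math.sin(2.0 * math.pi * 200.0 * t / 8000.0) for t in range(num_samples)]
    noise_a = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    noise_b = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]

    # One channel: speech plus its own noise.
    single = snr_db(speech, noise_a)
    # Summed channels: speech adds coherently, the noises add in power.
    summed_speech = [2.0 * s for s in speech]
    summed_noise = [a + b for a, b in zip(noise_a, noise_b)]
    summed = snr_db(summed_speech, summed_noise)
    return summed - single


gain = two_channel_gain_db()
print(f"SNR gain from summing two channels: {gain:.2f} dB")
```

With enough samples the printed gain should be close to 3 dB. Correlated noise or misaligned speech between the channels would reduce it, which is why this holds only in the idealized simulated condition.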