Subject: Re: MFCC method
From: Laszlo Toth <tothl@xxxxxxxx>
Date: Sat, 10 Jan 2009 15:50:59 +0100
List-Archive: <http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

On Fri, 9 Jan 2009, Richard F. Lyon wrote:

> and the observations of others that the principal components
> of a collection of vowel spectra on a warped frequency scale aren't
> so far from the cosine basis functions.

The warping of the frequency axis indeed invalidates the original
motivation of cepstrum calculation: the deconvolution of pitch and the
spectral envelope. Unfortunately, this is usually not emphasized in
textbooks. Furthermore, the conventional MFCC computation algorithm
contains a (weighted) summation of spectral bands, which pretty much does
the smoothing as well.

So I think that what makes the cosine transform (or FFT) step practically
useful is that it approximates a principal component analysis (as Dick
Lyon said) -- and that it decorrelates the features. This is important
because the MFCC features are in most cases modelled by Gaussians with
diagonal covariance matrices.

               Laszlo Toth
        Hungarian Academy of Sciences         *
  Research Group on Artificial Intelligence   *   "Failure only begins
     e-mail: tothl@xxxxxxxx                   *    when you stop trying"
     http://www.inf.u-szeged.hu/~tothl        *
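
The decorrelation argument in the message can be illustrated with a small numerical sketch. It is not taken from the thread and uses no speech data. It assumes, purely for illustration, that neighbouring log band energies are correlated like a first-order Markov process (covariance rho^|i-j|, with rho = 0.9 and 20 bands chosen arbitrarily), and it measures how much of the covariance energy lies off the diagonal before and after an orthonormal DCT-II. All function and variable names are hypothetical.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th cosine basis function."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
                     for j in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    return [list(r) for r in zip(*a)]

def offdiag_fraction(cov):
    """Share of the squared covariance entries that lie off the diagonal."""
    total = sum(v * v for row in cov for v in row)
    diag = sum(cov[i][i] ** 2 for i in range(len(cov)))
    return (total - diag) / total

# Assumed toy model, not measured data: 20 bands, neighbour correlation 0.9.
n_bands = 20
rho = 0.9
cov_bands = [[rho ** abs(i - j) for j in range(n_bands)]
             for i in range(n_bands)]

# Covariance of the DCT-transformed features: C * Sigma * C^T.
c = dct_matrix(n_bands)
cov_cepstra = matmul(matmul(c, cov_bands), transpose(c))

off_before = offdiag_fraction(cov_bands)
off_after = offdiag_fraction(cov_cepstra)
print("off-diagonal share before DCT:", off_before)
print("off-diagonal share after DCT: ", off_after)
```

Under this assumed model the off-diagonal share drops sharply after the transform, which is the property that makes a diagonal-covariance Gaussian a more reasonable fit for the transformed features than for the raw band energies. Real filter-bank outputs will not follow this covariance exactly, so the numbers only indicate the trend.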