Re: MFCC method
On Fri, 9 Jan 2009, Richard F. Lyon wrote:
> and the observations of others that the principal components
> of a collection of vowel spectra on a warped frequency scale aren't
> so far from the cosine basis functions.
The warping of the frequency axis indeed invalidates the original
motivation of cepstrum calculation: the deconvolution of pitch and the
spectral envelope. Unfortunately, this is usually not emphasized in
textbooks. Furthermore, the conventional MFCC computation algorithm
contains a (weighted) summation of spectral bands, which pretty much does
the smoothing as well. So I think that what makes the cosine transform (or
FFT) step practically useful is that it approximates a principal component
analysis (as Dick Lyon said) -- and that it decorrelates the features.
This is important because the MFCC features are in most cases modelled by
Gaussians with diagonal covariance matrices.
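
To make the decorrelation point concrete, here is a minimal sketch in Python
(NumPy only). It is an illustration under stated assumptions, not a
measurement on speech: the "log band energies" are synthetic Gaussian frames
whose covariance between bands i and j is rho^|i-j|, a smooth neighbour
correlation that I am assuming as a stand-in for real mel-band data. The
function names, the band count, and rho = 0.9 are all made up for the example.
It compares how much of the covariance sits off the diagonal before and after
an orthonormal DCT-II.

```python
import numpy as np


def dct_ii_matrix(n):
    """Orthonormal DCT-II matrix; each row is one cosine basis function."""
    k = np.arange(n)[:, None]   # basis (cepstral) index
    m = np.arange(n)[None, :]   # band index
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    basis[0] *= np.sqrt(1.0 / n)
    basis[1:] *= np.sqrt(2.0 / n)
    return basis


def offdiag_fraction(cov):
    """Fraction of the squared covariance mass that lies off the diagonal."""
    total = np.sum(cov ** 2)
    diag = np.sum(np.diag(cov) ** 2)
    return (total - diag) / total


def demo(n_bands=20, rho=0.9, n_frames=20000, seed=0):
    """Return the off-diagonal fraction before and after the DCT."""
    rng = np.random.default_rng(seed)
    idx = np.arange(n_bands)
    # Assumed covariance: neighbouring bands are strongly correlated.
    true_cov = rho ** np.abs(idx[:, None] - idx[None, :])
    frames = rng.multivariate_normal(np.zeros(n_bands), true_cov, size=n_frames)
    # Apply the DCT along the band axis, as in the last MFCC step.
    cepstra = frames @ dct_ii_matrix(n_bands).T
    before = offdiag_fraction(np.cov(frames, rowvar=False))
    after = offdiag_fraction(np.cov(cepstra, rowvar=False))
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print(f"off-diagonal fraction before DCT: {before:.3f}")
    print(f"off-diagonal fraction after DCT:  {after:.3f}")
```

On this synthetic data most of the covariance mass is off-diagonal before the
transform and only a small part remains there afterwards, which is the
property that makes a diagonal-covariance Gaussian a less crude model. How
well this carries over to real mel-band energies depends on how close their
covariance is to the assumed form.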
Laszlo Toth
Hungarian Academy of Sciences *
Research Group on Artificial Intelligence * "Failure only begins
e-mail: tothl@xxxxxxxxxxxxxxx * when you stop trying"
http://www.inf.u-szeged.hu/~tothl *