Re: MFCC method ("Richard F. Lyon" )


Subject: Re: MFCC method
From:    "Richard F. Lyon"  <DickLyon@xxxxxxxx>
Date:    Fri, 9 Jan 2009 23:18:49 -0800
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Arturo,

You are right that the normal cepstrum is partly motivated by the periodic frequency-spectrum ripples that come from the pitch harmonics. There's not a lot of logic to using a Fourier or cosine transform on a warped frequency scale if you're looking for those pitch ripples.

The real "logic", in retrospect at least, is the observation of Pols that the principal components capture most of the variance using a few smooth basis functions, smoothing away the pitch ripples; and the observations of others that the principal components of a collection of vowel spectra on a warped frequency scale aren't so far from the cosine basis functions.

Far be it from me to defend MFCC as a sound representation. But if what you care about is a smoothed short-time power spectrum without much pitch effect, it's not bad.

Dick

At 8:18 PM -0800 1/9/09, Arturo Camacho wrote:
>Actually, I do not find much logic behind taking the Fourier transform
>(FT) of a log-amplitude spectrum transformed to a (quasi) logarithmic
>scale, as done in MFCC. It is reasonable to take the FT of a
>log-amplitude spectrum in the linear frequency scale (standard
>cepstrum analysis) because this spectrum is often almost periodic (at
>least for most naturally-occurring periodic signals). However, after a
>(quasi-) logarithmic frequency scale transformation, I would rarely
>expect the spectrum to be periodic (it will stretch as the frequency
>increases), and therefore I do not find the logic behind trying to
>represent it as a linear combination of sinusoids, as done implicitly
>when taking a FT.
>
>Arturo
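
For readers who want the computation under discussion spelled out, here is a minimal sketch of an MFCC-style frame analysis: power spectrum, triangular filters on a warped (mel) frequency scale, logarithm, then projection onto a few cosine basis functions. Nothing here comes from the thread itself. The mel formula (the common 2595 log10(1 + f/700) convention), the filter count, the number of coefficients, and all function names are assumptions chosen for illustration, and real implementations differ in normalization and other details.

```python
import numpy as np

def hz_to_mel(f):
    # One common mel convention (an assumption; several variants exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_filters + 2))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def dct2_basis(n_coeffs, n_points):
    """Rows are unnormalized DCT-II cosine basis functions."""
    k = np.arange(n_coeffs)[:, None]
    n = np.arange(n_points)[None, :]
    return np.cos(np.pi * k * (n + 0.5) / n_points)

def mfcc_frame(frame, sr, n_filters=26, n_coeffs=13):
    """MFCC-style coefficients for one frame (hypothetical helper)."""
    windowed = frame * np.hanning(frame.size)
    power = np.abs(np.fft.rfft(windowed)) ** 2
    # Warped-frequency smoothing: each filter sums power over a mel band.
    mel_energy = mel_filterbank(n_filters, frame.size, sr) @ power
    log_mel = np.log(mel_energy + 1e-10)  # small floor avoids log(0)
    # Keeping only the first few cosine terms retains the smooth envelope.
    return dct2_basis(n_coeffs, n_filters) @ log_mel

# Usage: a synthetic harmonic frame with a 120 Hz fundamental.
sr = 16000
t = np.arange(512) / sr
frame = sum(np.sin(2 * np.pi * 120.0 * h * t) / h for h in range(1, 20))
coeffs = mfcc_frame(frame, sr)
```

Keeping only the low-order cosine terms is the smoothing step: the first few basis functions vary slowly across the filter channels, so they follow the broad spectral shape. Note too that the mel filters widen with frequency, so much of the harmonic ripple is already averaged out before the cosine transform is applied.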


This message came from the mail archive
http://www.auditory.org/postings/2009/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University