Re: Origin of the Mel frequency scale equation?
Davis & Mermelstein (1980) say in their footnote:
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1163420
| Fant [ 8] compares Beranek's mel-frequency scale, Koenig's scale,
| and Fant's approximation to the mel-frequency scale. Since the differ-
| ences between these scales are not significant here, the mel-frequency
| scale should be understood as a linear frequency spacing below 1000 Hz
| and a logarithmic spacing above 1000 Hz.
where [8] is:
C. G. M. Fant, "Acoustic description and classification of pho-
netic units," Ericsson Technics, vol. 1, 1959; also G. Fant,
Speech Sounds and Features. Cambridge, MA: MIT Press, 1973,
pp. 32-83.
Searching on Beranek+mel, I find references to:
L. L. Beranek, Acoustic Measurements (Wiley, New York, 1949), p. 329.
as the source for mel(f) = 1127 ln(1 + f/700)
This is the equation used in HTK's HSigP.c
(presumably as written by Steve Young in 1989), which
is probably the most widely-used mel calculation in the
world by data processed.
http://htk.eng.cam.ac.uk/
In a detailed study published at ICASSP'99, Umesh, Cohen and Nelson
cite O'Shaughnessy's 1987 book as the source for
mel(f) = 2595 log_10(1+f/700), which is the same curve in base 10
(2595/ln(10) is approximately 1127).
Fitting the Mel Scale, S. Umesh, L. Cohen, D. Nelson,
ICASSP 1999 (Phoenix) , I-217-220.
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=758101
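A quick numerical check that the natural-log and base-10 forms above
describe the same curve. This is only a minimal sketch: the function
names mel_htk and mel_oshaughnessy are my own labels for illustration,
not identifiers taken from HTK or from the book.

```python
import math

def mel_htk(f_hz):
    """Natural-log form: 1127 ln(1 + f/700)."""
    return 1127.0 * math.log(1.0 + f_hz / 700.0)

def mel_oshaughnessy(f_hz):
    """Base-10 form: 2595 log10(1 + f/700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

# 2595 / ln(10) is close to 1127, so the two forms differ only by the
# rounding of the published constants.
for f in (100.0, 1000.0, 4000.0, 8000.0):
    print(f, mel_htk(f), mel_oshaughnessy(f))
```

Both forms put 1000 Hz very close to 1000 mel, and the relative
difference between them stays far below anything that would matter for
filterbank placement.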
They also credit Koenig with first proposing a "split" approximation
that is linear below 1000 Hz and logarithmic above, adjusted
to have continuous slope at 1000 Hz.
W Koenig, "A new frequency scale for acoustic
measurements" Bell Telephone Laboratory Record,
vol. 27, pp. 299-301, 1949
This is the form used in Slaney's Matlab Auditory Toolbox (1993),
probably the second-most widely used version of the calculation.
http://cobweb.ecn.purdue.edu/~malcolm/interval/1998-010/
A couple of years ago I tried to implement a "universal" MFCC
calculation routine that could mimic the various other implementations
I knew of. It isn't perfect, but at least it makes explicit some of the
axes of variation.
http://labrosa.ee.columbia.edu/matlab/rastamat/mfccs.html
DAn.
p.s. here is a version of Slaney's equation, which maps the frequency
range 133 Hz to 6400 Hz to the range 0.0 to 40.0, with 1000 Hz mapping
to 13.0, and being linear below and logarithmic above that point.
mel(f) = { (f - f_0)/f_step          for f <= f_b
         { m_b + ln(f/f_b)/m_step    for f >  f_b
where f_0 = 133.33, f_step = 66.67, f_b = 1000,
m_b = (f_b - f_0)/f_step = 13.0 (by construction)
and m_step = ln(6.4)/27 - so the range from
1000 Hz to 6400 Hz accounts for the remaining
27.0 to take the scale up to 40.
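
The piecewise equation above can be sketched in a few lines of Python.
This is an illustration, not Slaney's actual Matlab code: the name
mel_slaney and the constant names are hypothetical, and I am assuming
that 133.33 and 66.67 stand for the exact values 400/3 and 200/3, which
makes m_b come out at 13.0.

```python
import math

# Assumption: 133.33 and 66.67 are rounded forms of 400/3 and 200/3.
F_0 = 400.0 / 3.0              # lowest frequency, maps to 0.0
F_STEP = 200.0 / 3.0           # Hz per scale unit in the linear region
F_B = 1000.0                   # break frequency between the two regions
M_B = (F_B - F_0) / F_STEP     # scale value at the break, 13.0
M_STEP = math.log(6.4) / 27.0  # log-frequency step per unit above F_B

def mel_slaney(f_hz):
    """Linear below F_B, logarithmic above, per the equation above."""
    if f_hz <= F_B:
        return (f_hz - F_0) / F_STEP
    return M_B + math.log(f_hz / F_B) / M_STEP

for f in (F_0, 1000.0, 6400.0):
    print(f, mel_slaney(f))
```

The three printed points are the anchors described above: the bottom of
the range, the break at 1000 Hz, and the top of the range at 6400 Hz.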