Re: Frequency to Mel Formula ("Richard F. Lyon" )


Subject: Re: Frequency to Mel Formula
From:    "Richard F. Lyon"  <DickLyon@xxxxxxxx>
Date:    Mon, 10 Aug 2009 22:04:06 -0700
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

With respect to Umesh ("Fitting the Mel Scale", 1999), I hadn't
actually got hold of his paper until just now; sure enough, he
compared all the same fits, but started with a different table, from
Stevens and Volkmann.

Here are the Stevens and Volkmann numbers:

  f_stevens = [40; 161; 200; 404; 693; 867; 1000; 2022; 3000; 3393; 4109; 5526; 6500; 7743; 12000];
  mel_stevens = [43; 257; 300; 514; 771; 928; 1000; 1542; 2000; 2142; 2314; 2600; 2771; 2914; 3229];

Here are the Fant numbers that I used:

  % Beranek's tabulated data that Fant said fit log(1 + f/1000):
  f_baranek = [20; 160; 394; 670; 1000; 1420; 1900; 2450; 3120; 4000; 5100; 6600; 9000; 14000];
  mel_beranek = (0:250:3250)';

I've added the Stevens table points on the svg plot at
http://dicklyon.com/tech/Hearing/Mel-like_scales.svg

The Umesh curve is closer to the data they fitted, naturally.

Looks like the Fant numbers are indeed from Beranek:
http://books.google.com/books?id=yCsLAAAAMAAJ&q=mel+inauthor:beranek&dq=mel+inauthor:beranek&lr=&as_drrb_is=b&as_minm_is=0&as_miny_is=&as_maxm_is=0&as_maxy_is=1950&as_brr=0&ei=FbGASsuGFZuOkQTylZStCg
and
http://books.google.com/books?id=WKM8AAAAIAAJ&q=3450+inauthor:beranek&dq=3450+inauthor:beranek&lr=&as_brr=0&ei=SLGASraCI6KKkASh0OivCg

Jim Beauchamp kindly asked the right questions that helped me
clarify this.

Dick

>Don,
>
>Thanks again for your great explanations of this complicated stuff.
>
>All that notwithstanding, I'm still poking around at why we have
>these two different mel scales, with breaks at 700 and 1000. So I
>got hold of Fant's book, which has Beranek's data table in it, and
>plotted up some comparisons.
>
>See http://dicklyon.com/tech/Hearing/Mel-like_scales.svg
>
>The "Mel 1000" curve comes pretty close to the Beranek table data up
>through about 4 kHz, then diverges far from it above that. The "Mel
>700" curve misses pretty badly around 2-6 kHz, but fits better on
>average if you count the highest frequencies.
>
>The "Umesh" curve, f / (0.741 + 0.00024*f), doesn't fit particularly
>well, but has a good shape, so I did a "fit" and got f / (0.759 +
>0.000252*f).
>
>I also did a mel-type fit, and found a broad optimum for the corner
>around 711.5 Hz (under the constraint that 1000 Hz maps to 1000,
>which I should probably have tried relaxing, but didn't).
>
>Anyway, here's my theory: Fant fitted to the frequency range he
>cared about, which probably only went to 4 kHz or so. And then
>someone else probably did a fit to the same Beranek table over the
>whole range, and got the 700 number (the plot shows that the 711.5
>points are pretty much right on the 700 curve). And that's why we
>see Beranek referenced so much, maybe?
>
>I also looked at goodness of fit (sum squared error in mel space)
>including all the frequencies in the Fant/Beranek table. It turns
>out that the Umesh type fit has only 1/8 as much error as the
>mel-like fit, due to the Bark-like curvature at the high-frequency
>end.
>
>So for people who like Beranek's table (assuming Fant has a true
>copy of it), the Umesh type function should be a win. But I don't
>think that function extends well to the larger log-like range that
>we find in the ERB and Greenwood type curves, which are the ones
>that make more sense in auditory-based applications.
>
>That's my theory and I'm sticking to it.
>
>Dick
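
For readers who want to reproduce the comparison described above, here is a minimal sketch in Python (not the script actually used for the plot, which is not included in the message). It assumes the "Mel 700" and "Mel 1000" curves have the form 1000 * log(1 + f/corner) / log(1 + 1000/corner), which forces 1000 Hz to map to 1000 mel, and it measures goodness of fit as sum squared error in mel space over the Fant/Beranek table. The names mel_like, umesh_like and sse are made up for this illustration, and the brute-force corner scan is only a stand-in for whatever fitting procedure was really used.

```python
import math

# Beranek's table as given in Fant (copied from the message above).
F_BERANEK = [20, 160, 394, 670, 1000, 1420, 1900, 2450, 3120,
             4000, 5100, 6600, 9000, 14000]
MEL_BERANEK = [250 * i for i in range(14)]  # 0, 250, ..., 3250


def mel_like(f, corner):
    """Log-type mel curve with the given corner in Hz.

    Assumed form: scaled so that 1000 Hz maps to 1000 mel.
    """
    return 1000.0 * math.log(1.0 + f / corner) / math.log(1.0 + 1000.0 / corner)


def umesh_like(f, a, b):
    """Umesh-type curve f / (a + b*f)."""
    return f / (a + b * f)


def sse(curve, freqs=F_BERANEK, mels=MEL_BERANEK):
    """Sum squared error in mel space between a curve and a table."""
    return sum((curve(f) - m) ** 2 for f, m in zip(freqs, mels))


sse_mel_1000 = sse(lambda f: mel_like(f, 1000.0))
sse_mel_700 = sse(lambda f: mel_like(f, 700.0))
sse_umesh = sse(lambda f: umesh_like(f, 0.741, 0.00024))        # published
sse_umesh_fit = sse(lambda f: umesh_like(f, 0.759, 0.000252))   # refit

# Brute-force scan for the corner that minimizes the error, keeping
# the constraint that 1000 Hz maps to 1000 mel.
corners = [500.0 + 0.5 * i for i in range(1001)]  # 500 Hz to 1000 Hz
best_corner = min(corners, key=lambda c: sse(lambda f: mel_like(f, c)))

print("SSE, Mel 1000:      %.0f" % sse_mel_1000)
print("SSE, Mel 700:       %.0f" % sse_mel_700)
print("SSE, Umesh:         %.0f" % sse_umesh)
print("SSE, Umesh refit:   %.0f" % sse_umesh_fit)
print("Best mel corner:    %.1f Hz" % best_corner)
```

Swapping in the Stevens and Volkmann table for F_BERANEK and MEL_BERANEK gives the corresponding comparison against that data.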


This message came from the mail archive
http://www.auditory.org/postings/2009/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University