Re: Intermediate representation for music analysis (Bob Masta)


Subject: Re: Intermediate representation for music analysis
From:    Bob Masta  <audio@xxxxxxxx>
Date:    Mon, 17 Jul 2006 09:01:18 -0400

Note that no matter what sort of analysis you do, the frequency resolution is determined by the reciprocal of the analysis window duration. So if you want fine resolution for the low frequencies, you need a long sample set, even if you only need much coarser resolution at the high frequencies (due to the log nature of hearing).

So, why not just take a long FFT? Even though they have linear frequency spacing, FFTs have been heavily optimized for efficient computation. I wonder if it might be better to use a conventional FFT and lump some upper bins together to form quasi-log bands, rather than using a less-efficient log-spaced filter bank.

There is one weakness to that approach, however: if you set the overall FFT length so that the lowest band you want to handle is just exactly matched by the lowest FFT spectral line width, then the next spectral line will be at *twice* that frequency, a full octave higher, so there will be no nice fractional-octave alignment. If you really need that, a log filter bank may be best.

However, the way I have seen this handled is to assume (hope?) that there will be plenty of upper harmonics in the signal, many of which will fall into regions of the FFT where the resolution (considered on an octave basis) is much higher. By looking at a few of these upper harmonics, it was possible to figure out what the actual fundamental frequency was to similarly high resolution.

Best regards,

Bob Masta

audioATdaqartaDOTcom
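For anyone who wants to try the bin-lumping idea, here is a rough sketch in Python with NumPy. It is not from the original post: the function name quasi_log_bands, the Hann window, and the default of 3 bands per octave are all assumptions chosen for illustration. It takes one long FFT, reports the resolution as the reciprocal of the window duration, and sums the power of the linear bins that fall inside each constant-ratio band.

```python
import numpy as np


def quasi_log_bands(signal, sample_rate, f_low, bands_per_octave=3):
    """Lump linear FFT power bins into roughly log-spaced bands.

    Hypothetical helper: the name, the Hann window, and the defaults
    are illustrative assumptions, not taken from the original post.
    """
    n = len(signal)
    # Resolution is the reciprocal of the window duration: df = 1/T = fs/N.
    df = sample_rate / n

    # One long FFT with linear bin spacing.
    power = np.abs(np.fft.rfft(signal * np.hanning(n))) ** 2
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)

    # Band edges at a constant frequency ratio, from f_low up toward Nyquist.
    nyquist = sample_rate / 2.0
    n_edges = int(np.floor(bands_per_octave * np.log2(nyquist / f_low))) + 1
    edges = f_low * 2.0 ** (np.arange(n_edges) / bands_per_octave)

    band_power = []
    bins_per_band = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Select every FFT bin whose center frequency lies in [lo, hi).
        in_band = (freqs >= lo) & (freqs < hi)
        band_power.append(power[in_band].sum())
        bins_per_band.append(np.count_nonzero(in_band))

    return edges, np.array(band_power), np.array(bins_per_band), df


if __name__ == "__main__":
    sr = 8000                      # assumed sample rate, Hz
    t = np.arange(sr) / sr         # 1 second of data, so df = 1 Hz
    x = np.sin(2 * np.pi * 440.0 * t)
    edges, band_power, bins_per_band, df = quasi_log_bands(x, sr, f_low=50.0)
    k = int(np.argmax(band_power))
    print("resolution (Hz):", df)
    print("strongest band (Hz):", edges[k], "to", edges[k + 1])
    print("bins in lowest / highest band:", bins_per_band[0], bins_per_band[-1])
```

The bins-per-band count shows the trade-off described above: the top bands collect hundreds of FFT lines, while the lowest bands get only a handful. If the FFT length is shortened until df equals f_low, the lowest third-octave bands end up with one line or none, which is the alignment weakness mentioned in the post.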


This message came from the mail archive
http://www.auditory.org/postings/2006/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University