Re: Robust method of fundamental frequency estimation. (Eckard Blumschein)


Subject: Re: Robust method of fundamental frequency estimation.
From:    Eckard Blumschein  <Eckard.Blumschein@xxxxxxxx>
Date:    Tue, 27 Feb 2007 08:37:02 +0100
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Arturo Camacho <acamacho@xxxxxxxx> wrote:

> > autocorrelation-based pitch models that can NOT be expressed in terms of the
> > spectrum.
> > For example, the Meddis & Hewitt or Meddis & O'Mard models, or
> > Slaney & Lyon models,
> > derived from Licklider's duplex theory, which do the ACF after what the
> > cochlea model does, which is a separation into filter channels and a
>
> If I am not wrong, what Slaney & Lyon’s model does is to apply a summary
> autocorrelation to the output of a gammatone filterbank (it does some
> extra steps, but the main idea is that one). Since this can be shown to be
> equivalent to applying autocorrelation to the original signal (use
> Wiener–Khinchin theorem and linearity property of Fourier Transform),

Roberto,

You are wrong in your guess that applying a summary autocorrelation to the output of a filterbank is equivalent to applying autocorrelation to the original signal. According to the theorem you mentioned but perhaps did not understand, autocorrelation corresponds to performing a cosine transform twice, i.e. back and forth: a first cosine transform of a signal f_0(t) in the time domain yields F_0(omega) in the frequency domain. A subsequent second cosine transform of F_0(omega) yields f_1(tau), which is in the time domain again. These two steps together correspond to the autocorrelation function (ACF) of the o r i g i n a l signal: f_0(t)-->F_0(omega)-->f_1(tau).

Remember: the ACF corresponds to two cosine transforms, a first one and an inverting second one. Bogert and Tukey called that inverted spec_trum a ceps_trum, inverting the order of letters in the syllable spec into ceps. This f_1(tau) is what perhaps comes close to a major part of auditory function, even if it is hard to abandon what we learned, namely that we hear frequencies, and to admit that autocorrelation lag is largely equivalent to frequency.

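The back-and-forth route described above can be checked numerically. The following is only a minimal sketch under stated assumptions, not anyone's pitch model: in the standard statement of the Wiener–Khinchin theorem the quantity transformed back is the power spectrum |F_0(omega)|^2, and because that spectrum is real and even, the inverse Fourier transform used here reduces to a cosine transform. The test signal (200 Hz plus one harmonic at 8 kHz sampling), the function names acf_direct and acf_via_spectrum, and the lag search range are all hypothetical choices made for illustration.

```python
import numpy as np

def acf_direct(x):
    """Linear autocorrelation computed as sums of lagged products."""
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) for k in range(n)])

def acf_via_spectrum(x):
    """Autocorrelation as the inverse transform of the power spectrum."""
    n = len(x)
    # Zero-pad to 2n so the circular result equals the linear one for lags < n.
    spectrum = np.fft.rfft(x, 2 * n)       # first transform: time -> frequency
    power = np.abs(spectrum) ** 2          # power spectrum (real, non-negative)
    return np.fft.irfft(power, 2 * n)[:n]  # second transform: frequency -> lag

# Hypothetical test signal: 200 Hz fundamental plus its second harmonic.
fs = 8000.0
t = np.arange(800) / fs
x = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)

r_direct = acf_direct(x)
r_spectrum = acf_via_spectrum(x)
max_abs_diff = np.max(np.abs(r_direct - r_spectrum))

# Look for the strongest ACF peak among lags 20..79 (100 Hz to 400 Hz here).
lo, hi = 20, 80
peak_lag = lo + int(np.argmax(r_spectrum[lo:hi]))
f0_estimate = fs / peak_lag

print("largest difference between the two routes:", max_abs_diff)
print("peak lag:", peak_lag, "samples; F0 estimate:", f0_estimate, "Hz")
```

The two routes should agree to within rounding error, and the lag of the strongest peak maps directly to a frequency estimate, which is the sense in which lag and frequency are interchangeable here.
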
The ACF of the spectrum F_0(omega) would correspond not to just two but to three cosine transforms in series, and would eventually result in a function F_1 of omega: f_0(t)-->F_0(omega)-->f_1(tau)-->F_1(omega). The brain cannot directly process functions of omega.

In the cat, there are about 33,000 T-multipolar chopper neurons in the ventral cochlear nucleus (VCN). T means that they project directly to the IC via the trapezoid body (TB). They might translate place code into downsampled frequencies while at the same time preserving tonotopy. At least they show very regular responses, with a highly reproducible pattern of spike trains in which the interspike intervals are all about the same length. The firing frequencies of chopper neurons are on average about three times lower than the average firing frequencies of single auditory nerve fibers, which in turn already tend to be considerably lower than the corresponding characteristic frequency (CF) for CFs in excess of 500 Hz.

Regards,
Eckard Blumschein


This message came from the mail archive
http://www.auditory.org/postings/2007/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University