Re: Robust method of fundamental frequency estimation. ("Richard F. Lyon" )


Subject: Re: Robust method of fundamental frequency estimation.
From:    "Richard F. Lyon"  <DickLyon@xxxxxxxx>
Date:    Sat, 24 Feb 2007 10:33:02 -0800
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Arturo, thanks for the pointer to your poster. I had to go back to this message to decode the different algorithnms you compared. Now, I have some questions. You say "Although some of these algorithms were initially proposed using a time-domain approach, all of them can also be formulated using the spectrum of the signal, and that is the approach we took." That may be true, but there are other good time-domain correlation-based pitch models that can NOT be expressed in terms of the spectrum. For example, the Meddis & Hewitt or Meddis & O'Mard models, or Slaney & Lyon models, derived from Licklider's duplex theory, which do the ACF after what the cochlea model does, which is a separation into filter channels and a half-wave rectification. Did you consider any such models? Have others reported their results on that speech database? I think that's really the competition if you have a new pitch model, especially if you want more generality beyond speech and music. Your poster says that the spectra were estimated using FFT, and the next sentence says using a gammatone filterbank. Which is it? Or both? Oh, I see, one says the algorithm and the other the model. Why would you choose an algorithm that doesn't match the model? Why treat these as conceptually different things? An algorithm is a computational model, is it not? Dick At 6:47 PM -0500 2/2/07, Arturo Camacho wrote: >Dear members, > >I just want to add two points to what Yi-Wen said: > >> Dear list, >> >> >> Just want to draw your attention to a good summary on various >> auto-correlation based pitch determination methods, >> >> Arturo Camacho and John G. Harris, "A biological inspired pitch >> determination algorithm", Fourth Joint Meeting of ASA and ASJ, Honolulu, >> Nov. 2006. >> >> Contact arturo@xxxxxxxx if interested. >> >> Best regards, >> Yi-Wen > >First, in that presentation we not only did a summary of pitch estimation >algorithms (PEA) but also pointed out some pitfalls they have. Second, we >did it not only for autocorrelation based algorithms, but also for many >other algorithms we considered to be “classical”. Although some of these >algorithms were initially proposed using a time-domain approach, all of >them can also be formulated using the spectrum of the signal, and that is >the approach we took. We expressed those algorithms as the selection of >the pitch candidate (PC) that maximizes an integral transform of a >function of the spectrum. > >Below is a summary of our findings. For each algorithm, we give a short >DESCRIPTION, then the FUNCTION applied to the spectrum, the KERNEL of the >integral transform, and finally a PROBLEM of the algorithm. Sometimes you >will find that the algorithm also have problems presented before or >problems that will be presented later. Notice that the order we present >the algorithms is such that each subsequent algorithm does not exhibit the >problem mentioned for the previous algorithm. A final note about >semantics, to make the writing short in the descriptions, when we say >spectrum we mean MAGNITUDE of the spectrum. > >HARMONIC PRODUCT SPECTRUM (HPS) >------------------------------- >DESCRIPTION: multiplies the spectrum at multiples of the PC, or >equivalently, adds the log of the spectrum at multiples of the PC. >FUNCTION: log >KERNEL: periodic sum of pulses >PROBLEM: If any harmonic of the pitch is missing, the log is minus >infinity and therefore the integral is also minus infinity. > >SUB-HARMONIC SUMMATION (SHS) >---------------------------- >DESCRIPTION: adds the spectrum at multiples of the PC. >FUNCTION: none >KERNEL: periodic sum of pulses >PROBLEM: Any subharmonic of the pitch has the same score as the pitch. > >SUB-HARMONIC SUMMATION with decay >--------------------------------- >DESCRIPTION: Same as SHS but uses a decaying factor to give less weight to >high order harmonics. >FUNCTION: none >KERNEL: decaying periodic sum of pulses >PROBLEM: The same score it produces for a pulse train at the pitch is >produced for white noise at each PC. Therefore, not only it produces an >infinite number of pitch estimates for white noise but also they have the >same strength as a pulse train. > >SUBHARMONIC-TO-HARMONIC RATIO (SHR) >----------------------------------- >DESCRIPTION: Same as SHS but subtracts the spectrum at the middle points >between harmonics. Uses log spectrum, though. >FUNCTION: log >KERNEL: periodic sum of positive pulses plus half-period-shifted sum of >negative pulses >PROBLEM: Like all the algorithms presented above, it does not work for >inharmonic signals > >HARMONIC SIEVE (HS) >------------------- >DESCRIPTION: Same as SHS but instead of pulses it uses rectangles >FUNCTION: none >KERNEL: sum of rectangles >PROBLEM: weighting applied to spectrum is too sharp. A slight shift in a >component may take it in or out of the rectangle, possibly changing the >estimated pitch drastically. > >CEPSTRUM (CEP) >------------- >DESCRIPTION: Same as SHR but instead of pulses uses a cosine to transition >from 1 to -1. >FUNCTION: log >KERNEL: cosine >PROBLEM: uses the log (see HPS) > >UNBIASED AUTOCORRELATION (UAC) >------------------------------ >DESCRIPTION: Same as CEP but squares the spectrum >FUNCTION: square >KERNEL: cosine >PROBLEM: If signal is periodic then UAC is also periodic. Therefore there >are infinite number of maximums. Taking the first local maximum (excluding >maximum at zero) does not work either. Try a signal with first four >harmonics with magnitudes 1,6,1,1. At high enough levels its pitch >corresponds to the fundamental frequency, however, the first maximum in >the UAC corresponds to the second harmonic. > >BIASED AUTOCORRELATION (BAC) >------------------------------ >DESCRIPTION: Same as UAC but a bias is applied such that a weight of one >is applied to a period of 0 and decays linearly to zero for a period T, >where T is the size of the window. >FUNCTION: square >KERNEL: cosine >PROBLEM: Like UAC, the squaring of the spectrum gives to much emphasis to >salient harmonics. This feature combined with the bias may cause problems. >For example, for the 1,6,1,1 signal, the bias can make the score of the >second harmonic higher than the score of the fundamental (take for example >the fundamental period as T/4) > >END OF LIST >=========== > >In ISCAS 2007 we will be presenting an algorithm that avoids the problems >presented here. It will be published in the proceedings of the conference. >>From the order we presented here the algorithms it is easy to infer what >the algorithm looks like. > >Arturo > >-- >__________________________________________________ > >Arturo Camacho >PhD Student >Computer and Information Science and Engineering >University of Florida > >E-mail: acamacho@xxxxxxxx >Web page: www.cise.ufl.edu/~acamacho >__________________________________________________


This message came from the mail archive
http://www.auditory.org/postings/2007/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University