Arturo -
However, what I said about the equivalence of applying a summary autocorrelation to the output of an orthonormal linear filterbank and applying autocorrelation to the input signal is still true; it just does not apply to your model, because of the HWR.
That's true, but I wouldn't assume that people doing filterbank -> autocorrelation -> summary are simply misguided. Very often, in addition to the nonlinearity, there is per-channel normalization (and/or weighting to emphasize particular subbands/sources), which has a profound effect. I used gammatones plus nonlinearity and normalization in the pitch tracker in my CASA system, but I felt like I understood it better when I saw the way that Tolonen & Karjalainen simply whitened the spectrum (on a suitably coarse scale), then autocorrelated a low band (with HWR?) and a high band (with envelope extraction?) to avoid the need for a large number of parallel channels. I think their paper gets at a lot of the essence of many such pitch models.
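
To make the equivalence point concrete, here is a rough numerical sketch, not anyone's actual model. It assumes an idealized filterbank of non-overlapping brick-wall bands (a simple power-complementary stand-in for an orthonormal filterbank) and circular autocorrelation computed from the power spectrum. The function names, the band count, and the noise test signal are all made up for illustration. With purely linear channels the summary autocorrelation reproduces the autocorrelation of the input; with half-wave rectification, or with each channel normalized to unit energy, it no longer does.

```python
import numpy as np

def circular_autocorr(x):
    """Circular autocorrelation via the power spectrum (Wiener-Khinchin)."""
    spectrum = np.fft.rfft(x)
    return np.fft.irfft(np.abs(spectrum) ** 2, n=len(x))

def brickwall_filterbank(x, n_bands):
    """Split x into n_bands channels with non-overlapping ideal bandpass filters.

    Hypothetical stand-in for an orthonormal filterbank: the bands partition
    the spectrum, so the channel power spectra sum to the input power spectrum.
    """
    spectrum = np.fft.rfft(x)
    edges = np.linspace(0, len(spectrum), n_bands + 1).astype(int)
    channels = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = np.zeros_like(spectrum)
        band[lo:hi] = spectrum[lo:hi]
        channels.append(np.fft.irfft(band, n=len(x)))
    return np.array(channels)

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)          # arbitrary test signal
channels = brickwall_filterbank(x, n_bands=8)

direct = circular_autocorr(x)

# Linear channels: summary autocorrelation equals the input autocorrelation.
summary_linear = sum(circular_autocorr(c) for c in channels)

# Half-wave rectify each channel before autocorrelating.
summary_hwr = sum(circular_autocorr(np.maximum(c, 0.0)) for c in channels)

# Normalize each channel's autocorrelation by its zero-lag value (its energy).
summary_norm = sum(circular_autocorr(c) / circular_autocorr(c)[0] for c in channels)

linear_matches = np.allclose(summary_linear, direct)
hwr_matches = np.allclose(summary_hwr, direct)
norm_matches = np.allclose(summary_norm, direct)
print(linear_matches, hwr_matches, norm_matches)
```

Only the first comparison should hold. The other two are where the filterbank version starts doing something the plain autocorrelation cannot.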
DAn.
Tolonen, T. and M. Karjalainen (2000). A computationally efficient multipitch analysis model. IEEE Transactions on Speech and Audio Processing 8(6), 708–716.