
Re: Time and Space

Dear Peter and Alain

On Tue, 10 Jun 97 02:07:26 EDT Peter Cariani wrote

> I looked up your papers again in the IMPC proceedings. I do
> remember thinking that the IC modulation tunings were way
> sharper than anything I had seen in the literature, looking at
> the first paper (fig. 2) there are multiple, sharp peaks
> for Fm's of 200, 400, and 800 Hz.

Note that this figure is showing the output of a population of
modulation filters summed across  cochlear channels - a
summary modulation spectrum (SMS). Figure 2g shows the SMS for
a complex tone with a fundamental of 200 Hz. It would be
extremely worrying if it *didn't* show peaks at 400, 600 and

> Looking at Fig. 2 in the second
> paper -- the one on pitch shift -- you get an estimated
> pitch of 195 Hz for the perfectly harmonic case (800,1000,
> 1200). For the shifted case, (860, 1060, 1260), de Boer's
> rule would predict pitches at 215 and 172 Hz, whereas
> your plot shows a global maximum at 205 Hz. The peak at
> 185 Hz is yet smaller than one at 195 Hz. What psycho-
> physical results were these estimated pitches compared
> against?
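
As a rough arithmetic check on the figures quoted above, here is a
minimal sketch. It assumes (this is a reading of the quoted numbers,
not something stated in the thread) that the 215 and 172 Hz
predictions come from dividing the lowest component by the two
integers that bracket its ratio to the 200 Hz component spacing. The
function name candidate_pitches is hypothetical.

```python
import math

def candidate_pitches(reference_hz, spacing_hz):
    """Divide a reference component by the integers bracketing
    reference_hz / spacing_hz (an assumed, simplified rule)."""
    ratio = reference_hz / spacing_hz
    # floor and ceil coincide when the complex is exactly harmonic
    orders = sorted({math.floor(ratio), math.ceil(ratio)})
    return {n: reference_hz / n for n in orders}

# Shifted complex (860, 1060, 1260 Hz), lowest component as reference
shifted = candidate_pitches(860.0, 200.0)   # {4: 215.0, 5: 172.0}
# Harmonic complex (800, 1000, 1200 Hz)
harmonic = candidate_pitches(800.0, 200.0)  # {4: 200.0}

print(shifted)
print(harmonic)
```

Using the centre component (1060 Hz) as the reference instead gives
candidates near 212 and 177 Hz, so the choice of reference component
matters for the comparison.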

According to de Boer's data (1956) for *five*-component
inharmonic complexes (taken from Plomp (1976), Aspects of Tone
Sensation, Academic Press):

Centre component (Hz)   Lower pitch (Hz)   Upper pitch (Hz)

1060                    -                  215
1100                    182                220
1160                    195                -

Now remember I used a very crude quantization of 5 Hz steps so
there was always going to be some error, but nevertheless if
you look at figure 2 in terms of the centroids of the peaks
then my results for *three* components are approximately

Centre component (Hz)   Lower pitch (Hz)   Upper pitch (Hz)

1060                    -                  210
1100                    185                215
1160                    195                -
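
The two tables above can be compared entry by entry. The following is
a minimal sketch of that comparison, not part of the original
analysis: the values are copied from the tables, the variable and
function names are hypothetical, and it compares five-component data
with three-component model output, so it is only indicative.

```python
QUANTIZATION_HZ = 5.0  # step size of the model's pitch axis, from the text

# centre component (Hz) -> (lower pitch, upper pitch); None marks "-"
de_boer_five = {1060: (None, 215.0), 1100: (182.0, 220.0), 1160: (195.0, None)}
model_three = {1060: (None, 210.0), 1100: (185.0, 215.0), 1160: (195.0, None)}

def differences(data, model):
    """Return (centre, label, model - data) for entries present in both."""
    out = []
    for centre in sorted(data):
        pairs = zip(("lower", "upper"), data[centre], model[centre])
        for label, measured, predicted in pairs:
            if measured is not None and predicted is not None:
                out.append((centre, label, predicted - measured))
    return out

diffs = differences(de_boer_five, model_three)
max_abs = max(abs(delta) for _, _, delta in diffs)

for centre, label, delta in diffs:
    print(f"{centre} Hz {label}: {delta:+.0f} Hz")
print(f"largest deviation: {max_abs:.0f} Hz (step {QUANTIZATION_HZ:.0f} Hz)")
```

On these numbers the largest deviation is 5 Hz, which is the same size
as the quantization step.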

So the picture really is not as bad as you make out. I generated
these results in a single attempt, with one particular training
set of 10 harmonic complexes with equal-weight components. I
could have messed around with training sets of different
timbres, different numbers of harmonics, etc. etc. I could have
tweaked around with a whole load of different parameters which
would have given me slightly different numbers. But I chose not
to do so because it seemed clear to me that there was a
definite effect which was reasonably close to the data. What I
had demonstrated was that harmonic series pattern matching of a
*time-place* representation could show the right kind of pitch

> But for this stimulus (900, 1100, 1300) I would expect
> v. weak low pitches: a pitch at the true fundamental
> (100 Hz), and fainter ambiguous pitches around 185 Hz and 222 Hz.
> Your model doesn't predict any of these discrete pitches--
> the estimates have a broad, high-pass character starting
> with about 190 Hz. The model doesn't seem to work......

In figure 3 which you are referring to, I did *not* show the
output of the pattern matching component, but the SMS. What this
showed was that although the modulation spectrum only responds
strongly to first order intervals, it does in fact also show
some weak sensitivity to fine structure. In this case I
focussed around the 200 Hz area and showed that there was a
shift upwards in the SMS alone. If I had in fact combined this
with the harmonic series pattern matcher, it would have
shifted upwards the pitch estimates closer to the experimental

The reason you did not see anything around 100 Hz is that
I did not *look* at this range.
However, it may interest you to note the

"It is remarkable that the complex 900+1100+1300+1500+1700 Hz,
consisting of odd harmonics of a fundamental of 100 Hz, did
*not* [my italics] have a pitch corresponding to that
frequency, though de Boer explicitly searched for it." [Plomp.
(1976) Aspects. p 119.]

To return to the main point though, my conclusion was that a
full account of pitch shift required a hybrid of two mechanisms:
(a) a low-level temporal mechanism and (b) a central pattern
matching mechanism. According to Plomp on p. 120:

 "De Boer proved mathematically that both approaches are
equivalent. He explained the small deviations of the data
points from the theoretical lines by introducing a weighting
factor in favour of the lower partials. De Boer suggested,
although this has been overlooked by later investigators, that
*both* [his italics] mechanisms may play a part in pitch
sensation, the *spectral* [his italics]  one for low harmonic
numbers and the *temporal* [his italics] one for high harmonic

I agree with de Boer. My only difference is that the central
matching is carried out on a spectro-temporal representation,
rather than simply a spectral one.

> All I am suggesting, as gently as possible, is that you should
> pay more attention to the nature of the elements that are
> supposed to be carrying out your informational operations --
> just check to see if they in any way resemble what is seen
> physiologically.

On p. 177 of the ICMPC96 proceedings I wrote:

"There are many aspects of the model which are unrealistic.
Perhaps the most obvious is the use of linear filters for cell
modulation response properties in contrast to the known
non-linear behaviour of auditory neurones (Hewitt et al, 1992).
The only reason to favour the linear approach over a full
Hodgkin-Huxley model is one of computational tractability.
Even as it stands, the model typically uses 32 cochlear
channels, about 1000 ICC cells, about 1000 MGB cells and
30,000 cortical cells. This is approximately the correct ratio for
the auditory system, although the actual number is out by a
factor of 1000. The model thus runs at the limits of
conventional computing. Despite these and many other
limitations, the following three papers (Todd, these
proceedings, a,b,c) examine how the interaction of the
central processes may provide an account for some
psychophysical phenomena of pitch, time and auditory grouping."

I think it is clear from the quote above that I am aware of the
physiological limitations of my model. However, I am also aware
of the computational impossibility of constructing  a model of
the auditory system, including the cortex, without  making some
modelling approximations.

Perhaps you should have a go at modelling yourself sometime?
One of the basic principles of any kind of theoretical work is
that you start out your models as simple as possible, to at
least get them up and running, so that you can make some
comparisons with the data. Then when the model breaks down you
learn something. That's how theoretical science makes progress.
These are basic principles which, as an X-theoretical
physicist, are second nature to me, but I can understand why
experimentalists have a problem with modelling.

> so it's time to either go back and
> recheck for errors of implementation or time to rethink your
> basic assumptions.

No, I don't think so. In the spirit of de Boer I will continue
to model the interaction of low-level and central cortical

On Tue, 10 Jun 1997 08:40:33  Alain de Cheveigne wrote

> >It is known that there are some small
> >discrepancies between current timing models and experimental
> >data. e.g. the slope of the pitch shift in the case of Meddis
> >and Hewitt (1991) and the predicted mistuning of the
> >fundamental in the case of Hartmann and Doty (1996). It may be
> >that the hybrid model proposed here may account for these small
> >discrepancies of the timing models, but this requires testing."

> The phenomenon reported by Hartmann and Doty (JASA, 1996) had
> nothing to do with a virtual pitch shift.....

By the above I did not mean to imply that the Hartmann and
Doty (1996) phenomenon was a shift of virtual pitch, but rather
to give another example of pure timing models which are not in
perfect agreement with the data, and, in agreement with de
Boer, to suggest that pitch phenomena require a central pattern
mechanism as well as a timing mechanism.