
Re: pitch neurons (2)



Hi Martin and Eckhard,

I think it's good that we are having this discussion because pitch
is one of the core auditory percepts that we need to deal with if
we are to understand how the auditory system works as an
information processing system.

Some comments about ecological imperatives (re: Martin's point
about why we should be able to perceive aspects
of sounds that are not present in the natural (non-man-made) world).
I do think that ecological factors are important in
understanding specialized sensory systems that allow animals
to deal with the specifics of their environments (the best
examples involve intraspecies communication systems like
pheromones, visual displays, and acoustic communication, where
signal production and reception occur in the same animals, and
are therefore under common genetic control and selective
pressures -- no surprise that filters matched to signals that
the animal produces can arise).

However, I do think that many evolutionary psychologists go
overboard in thinking that sensory systems are just grab-bags of
specialized mechanisms -- adaptive specializations. Some sensory
systems are adapted to be general purpose -- the species cannot
predict/control the specific appearances that other animals (e.g. predators) will
take, and therefore there is high selective pressure for systems
that handle a wide range of signals in a flexible and robust manner.
The guts of pandas and anteaters are specialized, but there are plenty of
other omnivorous animals that can handle a wide range of foods
(they need to be able to do this if their environs are unpredictable).

Do we think that because triangles are not "natural" stimuli that it
is a mystery that we can identify their forms? Of course not -- there
are general purpose mechanisms for dealing with visual form. Likewise
there are general purpose mechanisms by which the auditory system
organizes the auditory scene and analyzes periodicities (pitch, timbre, rhythm).
There are a few basic dimensions of auditory perception, not 100 or 1000.
I agree with Chris Darwin and Dick Fay that organizing the auditory scene
is perhaps the most basic of these functions. I do
believe that the basic mechanisms for our senses evolved hundreds
of millions of years ago, and like the common vertebrate body-plan,
there are common core neurocomputational information processing
strategies that are conserved even amidst the (in some cases spectacular) evolution of the
sensory receptor organs.

Unresolved harmonics (when I hear this term I refer first to Plomp's direct
measurements of what can be "heard out and matched and what cannot", i.e. harmonics
above the 5th) can and do give rise to strong pitch percepts, and many animals
do hear these missing fundamentals. Are there any that we know of that don't,
once one takes into account the frequency limits of their hearing and their
upper limits of phase-locking?
(Some people have tried to answer this question for bats, but I'm not sure that
the verdict is clear one way or the other -- it is interesting that in constant-FM
bats that have overlapping cries and echoes, the Doppler-shifted beatings
that reflect relative velocity are modulations in the pitch range -- maybe the
bat is hearing a pitch sweep as it closes on its prey -- this pitch sweep would
be due to unresolved harmonics -- low-freq temporal modulations in high CF channels.
I'm told this idea has been proposed by a German bat-researcher, but I have forgotten his name,
my apologies to you, whoever you are.)
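
A minimal sketch of the missing-fundamental point above, under stated assumptions: the stimulus is a sum of equal-amplitude cosine harmonics 6 through 10 of 200 Hz (so there is no energy at 200 Hz itself), and the autocorrelation of the waveform is used as a crude stand-in for an interval-based analysis. The sample rate, search range, and all function names are illustrative choices, not anything taken from the studies mentioned.

```python
import math

FS = 20000                 # sample rate in Hz (assumed for illustration)
F0 = 200                   # fundamental, absent from the stimulus
HARMONICS = range(6, 11)   # harmonics 6..10 -> 1200..2000 Hz

def complex_tone(freqs, n_samples, fs=FS):
    """Sum of equal-amplitude cosines at the given frequencies."""
    return [sum(math.cos(2 * math.pi * f * n / fs) for f in freqs)
            for n in range(n_samples)]

def autocorr(x, lag, window):
    """Autocorrelation at one lag, using a fixed-length window."""
    return sum(x[n] * x[n + lag] for n in range(window))

def best_lag(x, min_lag, max_lag, window):
    """Lag (in samples) with the largest autocorrelation in the range."""
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: autocorr(x, lag, window))

WINDOW = 800    # 40 ms = 8 full periods of 200 Hz
MIN_LAG = 40    # 2 ms  (upper pitch limit of 500 Hz for the search)
MAX_LAG = 160   # 8 ms  (lower pitch limit of 125 Hz for the search)

stimulus = complex_tone([h * F0 for h in HARMONICS], WINDOW + MAX_LAG)
lag = best_lag(stimulus, MIN_LAG, MAX_LAG, WINDOW)
estimated_f0 = FS / lag
print(lag, estimated_f0)
```

The largest peak falls at a lag of 5 ms, the period of the absent 200 Hz fundamental, even though every component lies at or above 1200 Hz. This says nothing about how neurons do it; it only shows that the periodicity is present in the time structure of the signal.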

Although unresolved harmonics played an important role in older debates about temporal
vs. spectral theories of pitch, there is universal agreement that lower, resolved harmonics
produce stronger, better discriminated pitches than higher, unresolved ones. I myself think we should
concentrate first on the basic neural mechanisms that produce strong pitches (at a recent
conference it was postulated that there is a layer of hell in which sinners are condemned
to listen to nothing but unresolved harmonics for the rest of eternity -- let's not turn
psychophysics into a hell-on-earth).

Re: Steinschneider's paper, I am a big fan of their current source density recordings -- they
give a picture of what local cortical subpopulations see at their dendrites. These and their multiple unit
recordings in input layers suggest that periodicities of up to about 300 Hz are available in
cortical inputs (albeit much more weakly for 100-300 Hz). As far as I can see from their
data, however, there is little evidence for resolution of individual harmonics above the
2nd or 3rd (their 1999 paper with shifted harmonics showed rate changes when the
harmonic spacings were, if I remember correctly, near 250 or 300 Hz, which is about
1/3 of an octave at the 700-800 Hz BF recording site). This is consistent with what many
other cortical single unit studies have seen -- single and multi-unit rate profiles at their
finest resolve only about 1/2 to 1/3 octave. So if you want to base a spectral pattern theory
on the first 2 or 3 harmonics, and claim that this is a viable representation at the cortical
level, go ahead, but some of us will need more convincing on this interpretation. (I have yet to
see anyone anywhere retrodict the pitch of any stimulus with any accuracy from rate-place profiles at
levels above 60 dB SPL.) As far as I am aware, there is no
good evidence (yet) for a harmonic spectral pattern analysis per se anywhere in the auditory CNS
in the existence region (human or otherwise) of missing-F0 perception (as I said before, there are reports of
multipeak tuning curves, but these are invariably, as far as I can see, for BFs above 5 kHz).

First order interval representations have a number of problems -- the difficulty is that if you
jack up SPLs and increase firing rates, then longer intervals disappear from the distribution --
not very much like pitch perception. Similarly, in first order interval representations, insertion
of extra events within F0-periods that generate spikes, such as extra clicks added to an isochronous
click train, should completely destroy the F0-pitch. This does occur if one uses 2 kHz high-pass trains
(as Kaernbach showed), but not if one uses 2 kHz low-pass trains. The first-order interval model doesn't
work for low harmonics, but an all-order (population autocorrelation) model does work for these.
Secondly, the all-order population-interval model does show masking in the high-pass case, unlike
what Kaernbach assumed in his one-channel "straw-man" autocorrelation model. Among other things,
he assumed a one-click, one-spike correspondence, which neglects population-wide refractory dynamics.
We sorted some of these issues out at a pitch conference in August -- the issues are complicated by the
fact that the pitches themselves are very weak, near the threshold of detectability (hell-on-earth). If people are
interested, I can email the panel from my poster that shows the results of the full population-interval model.
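
A toy sketch of the first-order versus all-order distinction discussed above. It deliberately uses a single channel with exactly one spike per click, which is the simplification criticized above, and it ignores refractoriness and population effects entirely, so it illustrates only the definitions and not the population-interval model. Times are integers in units of 0.1 ms, and all names are hypothetical.

```python
from collections import Counter

def first_order_intervals(spike_times):
    """Intervals between consecutive spikes only."""
    s = sorted(spike_times)
    return [b - a for a, b in zip(s, s[1:])]

def all_order_intervals(spike_times, max_interval):
    """Intervals between every pair of spikes, up to max_interval."""
    s = sorted(spike_times)
    out = []
    for i, a in enumerate(s):
        for b in s[i + 1:]:
            d = b - a
            if d > max_interval:
                break          # s is sorted, so later spikes are farther still
            out.append(d)
    return out

PERIOD = 50      # 5 ms period -> 200 Hz click train
N_CLICKS = 20

# Isochronous train: one spike per click.
regular = [k * PERIOD for k in range(N_CLICKS)]
# Same train with an extra click (and spike) 2 ms after each original one.
with_extra = sorted(regular + [t + 20 for t in regular])

fo_regular = Counter(first_order_intervals(regular))
fo_extra = Counter(first_order_intervals(with_extra))
ao_extra = Counter(all_order_intervals(with_extra, 2 * PERIOD))

print("first-order, regular:", fo_regular[PERIOD])
print("first-order, extra clicks:", fo_extra[PERIOD])
print("all-order, extra clicks, most common interval:", ao_extra.most_common(1)[0][0])
```

In this idealized case the inserted spikes remove every 5 ms interval from the first-order distribution (only 2 ms and 3 ms intervals remain), while the 5 ms interval is still the most common one in the all-order distribution.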

Eckhard, I apologize if I misread your comment about refractory times and upper limits of representation
of frequency through phase-locking. One still sees bad neuroscience textbooks that dismiss temporal codes
on the argument that frequencies above 1 kHz could not be encoded because of neural refractory periods.

I agree that the two working hypotheses I suggested are not mutually exclusive (they need
sharpening up -- this has been very hard since we know so little about the nature of the central descending
systems and the overwhelming bulk of cortical physiology has concentrated on spike rates -- nevertheless,
I think they are useful heuristics for thinking about the possibilities...).

Finally, I forgot to mention that MTFs don't explain octave similarities either (although all-order interval
reps do). What we need to realize is that even at the level of the midbrain, there is still an abundance of temporal
information (go look at some of Langner's beautiful dot-raster figures that show locking up to about 800 Hz).
This temporal information does follow pitch perception (Greenberg's 1980 FFR study showed that the
temporal response patterns, again probably in dendritic inputs to IC, followed the fine structure and could account
for de Boer's rule). Years ago, I also recorded some field potentials in the central nucleus of the IC and
found similar patterns. If someone put a gun to my head and said that I had to predict the pitch of an 80 dB
SPL stimulus based on data from 1000 IC neurons ("or else"), but that I could have my choice of whether to go
with rate-MTF functions and measured rates or with interspike interval information, I'd take the interval
information in a heartbeat.
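
A small numerical sketch of what "following the fine structure" and de Boer's rule mean, under stated assumptions. The stimulus is a hypothetical five-component complex with 200 Hz spacing whose components are all shifted up by 30 Hz (1230 to 2030 Hz). De Boer's rule is taken here in its simple first-order form, pitch ≈ center frequency / harmonic number, and the waveform autocorrelation again stands in for interval information. None of the parameter values come from the studies cited above.

```python
import math

FS = 20000       # sample rate in Hz (assumed)
SPACING = 200    # component spacing in Hz
SHIFT = 30       # common upward shift of every component in Hz
RANKS = range(6, 11)                      # nominal harmonic numbers 6..10
freqs = [n * SPACING + SHIFT for n in RANKS]

def complex_tone(freqs, n_samples, fs=FS):
    """Sum of equal-amplitude cosines at the given frequencies."""
    return [sum(math.cos(2 * math.pi * f * n / fs) for f in freqs)
            for n in range(n_samples)]

def autocorr(x, lag, window):
    """Autocorrelation at one lag, using a fixed-length window."""
    return sum(x[n] * x[n + lag] for n in range(window))

WINDOW = 2000    # 100 ms: a whole number of cycles for every component
MIN_LAG = 40     # 2 ms
MAX_LAG = 160    # 8 ms

shifted = complex_tone(freqs, WINDOW + MAX_LAG)
peak_lag = max(range(MIN_LAG, MAX_LAG + 1),
               key=lambda lag: autocorr(shifted, lag, WINDOW))

interval_pitch = FS / peak_lag             # pitch implied by the peak lag
de_boer_pitch = freqs[2] / 8               # center component / its rank
print(peak_lag, round(interval_pitch, 2), de_boer_pitch)
```

The envelope of this stimulus still repeats every 5 ms, so an account based only on envelope or spacing would predict 200 Hz. The autocorrelation peak instead moves to a slightly shorter lag, close to the 203.75 Hz that the first-order rule gives, limited here by the 0.05 ms lag resolution.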

Maybe there is a special ring of hell for neurophysiologists who, instead of being able to hear sounds,
are condemned to look at the neural responses to sounds instead. Somehow this seems even worse than
only being able to hear unresolved harmonics.

I apologize for how long this turned out to be.

-- Peter Cariani