Re: pitch neurons (2) (Cariani )


Subject: Re: pitch neurons (2)
From:    Cariani  <peter(at)EPL.MEEI.HARVARD.EDU>
Date:    Fri, 11 Oct 2002 15:24:27 -0400

Hi Martin and Eckhard,
I think it's good that we are having this discussion, because pitch is one of the core auditory percepts that we need to deal with if we are to understand how the auditory system works as an information processing system.
Some comments about ecological imperatives (re: Martin's point about why we should be able to perceive aspects of sounds that are not present in the natural, non-man-made world). I do think that ecological factors are important in understanding specialized sensory systems that allow animals to deal with the specifics of their environments. The best examples involve intraspecies communication systems like pheromones, visual displays, and acoustic communication, where signal production and reception occur in the same animals and are therefore under common genetic control and selective pressures -- no surprise that filters matched to the signals the animal produces can arise. However, I do think that many evolutionary psychologists go overboard in thinking that sensory systems are just grab-bags of specialized mechanisms -- adaptive specializations. Some sensory systems are adapted to be general purpose: the species cannot predict or control the specific appearances that other animals (e.g. predators) will take, and therefore there is high selective pressure for systems that handle a wide range of signals in a flexible and robust manner. The guts of pandas and anteaters are specialized, but there are plenty of omnivorous animals that can handle a wide range of foods (they need to be able to do this if their environs are unpredictable).
Do we think that because triangles are not "natural" stimuli, it is a mystery that we can identify their forms? Of course not -- there are general purpose mechanisms for dealing with visual form. Likewise, there are general purpose mechanisms by which the auditory system organizes the auditory scene and analyzes periodicities (pitch, timbre, rhythm).
There are a few basic dimensions of auditory perception, not 100 or 1000. I agree with Chris Darwin and Dick Fay that organizing the auditory scene is perhaps the most basic of these functions. I do believe that the basic mechanisms for our senses evolved hundreds of millions of years ago, and that, like the common vertebrate body-plan, there are common core neurocomputational information processing strategies that are conserved even amidst the (in some cases spectacular) evolution of the sensory receptor organs.
Unresolved harmonics (when I hear this term I refer first to Plomp's direct measurements of what can be "heard out and matched and what cannot", i.e. harmonics above the 5th) can and do give rise to strong pitch percepts, and many animals do hear these missing fundamentals. Are there any that we know of that don't, once one takes into account the frequency limits of their hearing and their upper limits of phase-locking? (Some people have tried to answer this question for bats, but I'm not sure that the verdict is clear one way or the other. It is interesting that in constant-FM bats that have overlapping cries and echoes, the Doppler-shifted beatings that reflect relative velocity are modulations in the pitch range -- maybe the bat is hearing a pitch sweep as it closes on its prey. This pitch sweep would be due to unresolved harmonics -- low-frequency temporal modulations in high-CF channels. I'm told this idea has been proposed by a German bat researcher, but I have forgotten his name; my apologies to you, whoever you are.)
Although unresolved harmonics played an important role in older debates about temporal vs. spectral theories of pitch, there is universal agreement that lower, resolved harmonics produce stronger, better discriminated pitches than higher, unresolved ones.
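As a rough illustration of the missing-fundamental point above, here is a minimal sketch that builds a complex of harmonics 6-10 of 200 Hz (no energy at 200 Hz itself) and looks for the strongest periodicity in the waveform autocorrelation. The sampling rate, harmonic numbers, equal amplitudes, cosine phase, and search range are all assumptions chosen for illustration; this is a stimulus-level autocorrelation, not a model of any neural response.

```python
import math

def harmonic_complex(f0, harmonics, fs, duration):
    """Sum of equal-amplitude cosine harmonics of f0, sampled at fs."""
    n = int(fs * duration)
    return [sum(math.cos(2 * math.pi * h * f0 * t / fs) for h in harmonics)
            for t in range(n)]

def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of x for lags 0..max_lag (fixed window)."""
    window = len(x) - max_lag
    return [sum(x[i] * x[i + lag] for i in range(window))
            for lag in range(max_lag + 1)]

def best_period_lag(x, fs, min_hz, max_hz):
    """Lag (in samples) with the largest autocorrelation inside a pitch range."""
    shortest = int(fs / max_hz)
    longest = int(fs / min_hz)
    ac = autocorrelation(x, longest)
    return max(range(shortest, longest + 1), key=lambda lag: ac[lag])

FS = 20000   # assumed sampling rate, Hz
F0 = 200.0   # assumed fundamental, Hz (absent from the stimulus)

# Harmonics 6..10 only: components at 1200..2000 Hz, nothing at 200 Hz.
signal = harmonic_complex(F0, harmonics=range(6, 11), fs=FS, duration=0.05)

# Search 120-500 Hz so that only one multiple of the period is in range.
lag = best_period_lag(signal, FS, min_hz=120.0, max_hz=500.0)
estimated_f0 = FS / lag
print("strongest periodicity:", estimated_f0, "Hz")
```

The strongest lag falls at the period of the absent fundamental, which is the sense in which the temporal pattern carries F0 even when no component sits at that frequency.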
I myself think we should concentrate first on the basic neural mechanisms that produce strong pitches (at a recent conference it was postulated that there is a layer of hell in which sinners are condemned to listen to nothing but unresolved harmonics for the rest of eternity -- let's not turn psychophysics into a hell-on-earth).
Re: Steinschneider's paper, I am a big fan of their current source density recordings -- they give a picture of what local cortical subpopulations see at their dendrites. These and their multiple-unit recordings in input layers suggest that periodicities of up to about 300 Hz are available in cortical inputs (albeit much more weakly for 100-300 Hz). As far as I can see from their data, however, there is little evidence for resolution of individual harmonics above the 2nd or 3rd: their 1999 paper with shifted harmonics showed rate changes when the harmonic spacings were, if I remember correctly, near 250 or 300 Hz, which is about 1/3 of an octave at the 700-800 Hz BF recording site. This is consistent with what many other cortical single-unit studies have seen -- single- and multi-unit rate profiles at their finest resolve only about 1/2 to 1/3 octave. So if you want to base a spectral pattern theory on the first 2 or 3 harmonics, and claim that this is a viable representation at the cortical level, go ahead, but some of us will need more convincing on this interpretation. (I have yet to see anyone anywhere retrodict the pitch of any stimulus with any accuracy from rate-place profiles at levels above 60 dB SPL.)
As far as I am aware, there is no good evidence (yet) for a harmonic spectral pattern analysis per se anywhere in the auditory CNS in the existence region (human or otherwise) of missing-F0 perception (as I said before, there are reports of multipeak tuning curves, but these are invariably, as far as I can see, for BFs above 5 kHz).
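For the harmonic-spacing arithmetic above, a quick sketch of how a spacing in Hz translates into a fraction of an octave at a given BF. The BF and spacing values are the ones recalled from memory in the text; measuring the distance upward from BF (rather than centering it on BF) is an assumption made here for simplicity.

```python
import math

def spacing_in_octaves(bf_hz, spacing_hz):
    """Octave distance from bf_hz up to bf_hz + spacing_hz."""
    return math.log2((bf_hz + spacing_hz) / bf_hz)

for bf_hz in (700.0, 800.0):
    for spacing_hz in (250.0, 300.0):
        ratio = spacing_hz / bf_hz                      # spacing as a fraction of BF
        octaves = spacing_in_octaves(bf_hz, spacing_hz)  # same spacing in octaves
        print(f"BF {bf_hz:.0f} Hz, spacing {spacing_hz:.0f} Hz: "
              f"{ratio:.2f} of BF, {octaves:.2f} octave")
```

With these assumed values the spacings come out at roughly 0.4 to 0.5 octave, i.e. within the 1/2 to 1/3 octave resolution range quoted for cortical rate profiles.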
First-order interval representations have a number of problems. The difficulty is that if you jack up SPLs and increase firing rates, then longer intervals disappear from the distribution -- not very much like pitch perception. Similarly, in first-order interval representations, insertion of extra events within F0-periods that generate spikes, such as extra clicks added to an isochronous click train, should completely destroy the F0-pitch. This does occur if one uses 2 kHz high-pass trains (as Kaernbach showed), but not if one uses 2 kHz low-pass trains. The first-order interval model doesn't work for low harmonics, but an all-order (population autocorrelation) model does work for these. Secondly, the all-order population-interval model does show masking in the high-pass case, unlike what Kaernbach assumed in his one-channel "straw-man" autocorrelation model. Among other things, he assumed a one-click, one-spike correspondence, which neglects population-wide refractory dynamics. We sorted some of these issues out at a pitch conference in August -- the issues are complicated by the fact that the pitches themselves are very weak, near the threshold of detectability (hell-on-earth). If people are interested, I can email the panel from my poster that shows the results of the full population-interval model.
Eckhard, I apologize if I misread your comment about refractory times and upper limits of representation of frequency through phase-locking. One still sees bad neuroscience textbooks that dismiss temporal codes on the argument that frequencies above 1 kHz could not be encoded because of neural refractory periods. I agree that the two working hypotheses I suggested are not mutually exclusive. They need sharpening up, which has been very hard since we know so little about the nature of the central descending systems and since the overwhelming bulk of cortical physiology has concentrated on spike rates. Nevertheless, I think they are useful heuristics for thinking about the possibilities.
Finally, I forgot to mention that MTFs don't explain octave similarities either (although all-order interval representations do). What we need to realize is that even at the level of the midbrain, there is still an abundance of temporal information (go look at some of Langner's beautiful dot-raster figures that show locking up to about 800 Hz). This temporal information does follow pitch perception: Greenberg's 1980 FFR study showed that the temporal response patterns (again, probably in dendritic inputs to IC) followed the fine structure and could account for de Boer's rule. Years ago, I also recorded some field potentials in the central nucleus of the IC and found similar patterns. If someone put a gun to my head and said that I had to predict the pitch of an 80 dB SPL stimulus based on data from 1000 IC neurons ("or else"), but that I could have my choice of whether to go with rate-MTF functions and measured rates or with interspike interval information, I'd take the interval information in a heartbeat.
Maybe there is a special ring of hell for neurophysiologists who, instead of being able to hear sounds, are condemned to look at the neural responses to sounds instead. Somehow this seems even worse than only being able to hear unresolved harmonics.
I apologize for how long this turned out to be.
-- Peter Cariani
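To make the first-order vs. all-order interval distinction in the message above concrete, here is a toy sketch. It is deliberately a single-channel, deterministic spike train with a one-event, one-spike correspondence (exactly the simplification criticized above), so it does not model refractory dynamics, population pooling, or the high-pass masking result. The 10 ms period, the spike counts, and the extra-event offsets are hypothetical values chosen only so the histograms are easy to read.

```python
from collections import Counter

PERIOD_MS = 10  # assumed F0 period (100 Hz); integer ms keeps bins exact

def first_order_intervals(spikes):
    """Histogram of intervals between successive spikes only."""
    ordered = sorted(spikes)
    return Counter(b - a for a, b in zip(ordered, ordered[1:]))

def all_order_intervals(spikes, max_interval):
    """Histogram of intervals between every pair of spikes, up to max_interval."""
    ordered = sorted(spikes)
    counts = Counter()
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b - a > max_interval:
                break  # later spikes are even farther away
            counts[b - a] += 1
    return counts

# One spike per F0 period.
regular = [k * PERIOD_MS for k in range(20)]

# Hypothetical extra events inside each period, at offsets that vary
# from period to period (standing in for added clicks or higher rates).
offsets = [3, 4, 6, 7]
extra = [k * PERIOD_MS + offsets[k % 4] for k in range(20)]
dense = regular + extra

first = first_order_intervals(dense)
allo = all_order_intervals(dense, max_interval=15)
print("first-order count at the F0 period:", first[PERIOD_MS])
print("all-order most common interval (ms):", allo.most_common(1)[0][0])

# A train at twice the rate (an octave up) has no 10 ms first-order
# intervals, but its all-order distribution still contains them.
octave_up = [k * (PERIOD_MS // 2) for k in range(40)]
print("octave-up first-order count at 10 ms:",
      first_order_intervals(octave_up)[PERIOD_MS])
print("octave-up all-order count at 10 ms:",
      all_order_intervals(octave_up, max_interval=15)[PERIOD_MS])
```

In this toy case the extra spikes remove the F0-period interval from the first-order histogram entirely, while it remains the most common all-order interval; the octave-up train shows the shared interval that underlies the octave-similarity remark.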


This message came from the mail archive
http://www.auditory.org/postings/2002/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University