
Re: purely spectral pitch

Dear Peter Cariani,

At 14:41 30.10.00 +0200 I sent you a copy of my private reply to John
Culling. I am glad that you immediately took up the defence of our
common position. We certainly agree that the brain does not work in the
frequency domain at all, and that temporal codes are consequently the most
important ones in all cases, from the fastest to the slowest perceivable
changes of sound pressure.

There are just a few minor differences between us. I pointed to similar 140
ms latencies within A1. You are claiming a temporal contiguity window of
20-30 ms. I understand that you are dealing with the lower frequency limit
of pitch. I am, however, not sure about possible physiological correlates of
the alleged 20-30 ms and 10-15 ms windows. As for the limits of 30 Hz for
monaural vs. 60 Hz for dichotic perception, the doubling seems to me
unlikely to be accidental, as you suggest. As far as I know, binaural
phenomena are somewhat tricky. Remember binaural beating, ear dominance
in the perception of pitch, dichotic chords, and the three types of
dichotic pitch. As you know, there are several left-right crossings in the
auditory system.
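
As a rough check of how these window lengths relate to the quoted limits,
here is a minimal sketch. It assumes (a simplification introduced here, not
a claim made in this thread) that a window must span at least one full
period for a periodicity to be represented; the function name is
hypothetical.

```python
def lowest_periodicity_hz(window_ms: float) -> float:
    """Lowest frequency whose full period still fits inside the window.

    Assumption for illustration only: one complete period must fall
    within the temporal contiguity window.
    """
    if window_ms <= 0:
        raise ValueError("window_ms must be positive")
    return 1000.0 / window_ms


# Window lengths mentioned in the discussion (20-30 ms and 10-15 ms).
for window_ms in (30.0, 20.0, 15.0, 10.0):
    limit = lowest_periodicity_hz(window_ms)
    print(f"{window_ms:4.0f} ms window -> {limit:6.1f} Hz")
```

Under this assumption, halving the window doubles the lower limit, which is
at least numerically in line with the 30 Hz vs. 60 Hz figures; it says
nothing about which physiological structures would implement such windows.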

Concerning your involvement in autocorrelation-like models, I still feel
free to doubt in general that any simple mathematical model fits the reality
of hearing. We may hope to uncover part of the physiological background
soon. This will hopefully end the quarrel about first-order vs.
all-order ISIs.

In my opinion, the so-called purely spectral code includes two aspects
simultaneously; that is, it is initially determined temporally as well as
tonotopically. You are correct that Moore expressed similar thoughts.
Unfortunately, they are still quite uncommon, and I am not aware of
any consistent applications. For this reason, we still have to respect the
usual terminology, which divides what is actually temporal in both cases into
(purely) spectral code and temporal modulation.

At 14:46 30.10.00 -0400, you wrote:
>Hi John,
>I'm not clear on how these psychophysical findings
>bear on the nature of the neural representations
>and processing mechanisms involved. Part of my
>original point was that we need to be as clear as possible
>about whether we are talking about inferences made
>from psychoacoustical data, from neural data,
>or from neurocomputational models. We also need to
>lay out, as self-consciously as possible, the assumptions that we
>use to interpret data of one type to make inferences about
>the other.
>> There are two lines of evidence that suggest that the auditory system
>> does not process such time structure [the temporal structure of
>spike coincidences and anticoincidences that are hypothesized to be
>present in the outputs of
>binaural cross-correlators].  Brackets mine, pac.
>> Krumbholz and Patterson (1999, 2000) showed that the lowest discernible
>> pitch for complex dichotic pitches and for complex tones unmasked by
>> the binaural system is about an octave higher than the lowest pitch of
>> stimuli that also provide temporal cues. This increase in the lower
>> limit of pitch is to be expected from a mechanism that relies on
>> spectral decomposition by the cochlea, because the ERB never gets
>> below about 30 Hz and components less than 60 Hz or so apart cannot
>> be resolved at any frequency.
>Could you unpack this a bit?
>I haven't had a chance to read Krumbholz & Patterson yet, but
>the assumption of the argument that you present seems to be
>that the dichotic pitch mechanism is based on some kind
>of frequency-domain, spectral representation because there is some
>with the behavior of ERB's, and these are conventionally defined in the
>domain. We should bear in mind that this is psychophysical data that
>reflects the capacities of the whole system, and that inferences about
>nature of the underlying neural processing involved are anything but
>In neurocomputational terms, if one considers neural processing
>mechanisms in which one
>has a temporal contiguity window of 20-30 msec  (e.g. coincidence
>arrays with a limit on the maximum relative delay), then the lowest
>periodicities that
>can be represented/distinguished are around 30 Hz, and inputs that
>arrive outside of the
>temporal contiguity window will be processed independently (and therefore
>not summed together).  There could be different temporal
>contiguity/temporal integration
>windows for monaural and binaural processing that depend on the ranges
>of response
>latencies available to stations in the different pathways.
>This idea of a temporal contiguity constraint for patterns is
>consistent with the spectral shape integration windows of Chistovich
>(1985, JASA) (fusion of 2 temporally-offset single-formant
>vowels into one 2-formant vowel percept), of Hall (low pitches produced by
>non-temporally overlapping harmonics), and of Turgeon & Bregman
>(fusion/masking experiments using temporally-offset flankers). On the other
>hand, the integration windows for loudness summation are much longer, so we
>shouldn't assume that all processing involves the same temporal constraints.
>In any case, it seems to me that the argument rests on the assumption that
>psychophysically-observed ERB's must necessarily be explained in terms of
>purely spectral mechanisms (mainly because these are the terms in which
>current critical band models are cast). But I don't see anything that
>logically rules out a
>temporal neural mechanism for these phenomena. (Moore discussed some of
>these ideas
>in his textbook An Intro. to the Psychology of Hearing, 3rd ed.).
>Such a model could be formulated that took into account cochlear
>and rate-level functions (bandwidths of the filters and rate-level
>affect the temporal patterns and rates of discharge that determine
>the shapes of population-wide interval distributions).
>The main difference between a temporal account and a
>rate-place account would lie not in the neural response properties per se
>(these are what they are), but what aspects of neural responses are
>analyzed by the central auditory system. Central use of temporal information
>is one way to bridge the gap that separates the coarseness of cochlear
>filters with
>the fineness of auditory perception (lateral inhibition is another).
>> Krumbholz and Patterson (2000) and Culling and Colburn (2000) have found
>> that binaural sluggishness applies to detection of temporal modulation of
>> signals unmasked by the binaural system. In other words, it detects only
>> the grossest of temporal structures.
>Slow temporal modulation of a pitch is different from the representation of
>the pitch itself, and there may be different mechanisms that integrate
>images on different timescales such that fine distinctions can be made when
>the stimulus is stationary, but that changes in periodicity and/or
>location of the stimulus
>are registered more slowly. (Because we can distinguish ITD differences
>on the
>order of tens of microseconds doesn't mean we must register changes in
>that fast). So, I don't quite see that binaural sluggishness necessarily
>implies only a very coarse temporal analysis in all processing domains
>and at all levels of the system.
>-- Peter Cariani
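
For reference, the ERB figures in the quoted argument can be compared with
the Glasberg and Moore (1990) approximation, ERB = 24.7 (4.37 f/1000 + 1) Hz
with f in Hz. That formula is not given anywhere in this thread; it is
assumed here only as an illustration, and the function name is hypothetical.

```python
def erb_hz(center_hz: float) -> float:
    """Equivalent rectangular bandwidth in Hz at a centre frequency in Hz.

    Assumption: the Glasberg and Moore (1990) approximation, which is
    not part of the quoted discussion.
    """
    if center_hz < 0:
        raise ValueError("center_hz must be non-negative")
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)


# Bandwidth at a few low centre frequencies and at 1 kHz.
for center_hz in (0.0, 50.0, 100.0, 200.0, 1000.0):
    print(f"{center_hz:6.0f} Hz -> ERB {erb_hz(center_hz):6.1f} Hz")
```

Under this approximation the bandwidth has a floor near 25 Hz and grows with
centre frequency, which is of the same order as the "about 30 Hz" quoted
above. The sketch describes filter bandwidths only and does not decide
between a spectral and a temporal account of the pitch limit.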