
Re: Time and space



Dear Pierre


On Fri Jun  6 20:39 BST 1997 Pierre Divenyi wrote:

> Dear Neil

> Indeed, it would be nice if the state-of-the-brain were as you describe it:
> low-level time and frequency analysis represented orthogonally in the
> cortex. While it is true that Gerald Langner has found single units in
> his animals, and that the human MEG data look at least consistent,
> he would be the first to say out loud that your generalization of the
> data is scarcely more than wishful thinking.

Yes indeed, and he did say something like that to me on a
recent visit to Darmstadt.

> In particular, the time ranges
> some of us have been talking about during the present exchange of views
> go down to very, very low frequencies (=periods as long as 100-200 ms)
> which, as far as I know, have not been found to be well represented at
> the CN or the IC -- but correct me if I am wrong.

If you look back at all my previous messages you will see that
I have consistently argued that, as far as temporal frequency is
concerned, there is a need for three temporal frequency
dimensions: (i) cochleotopic (approx. 30 - 10,000 Hz), (ii)
periodotopic (approx. 10 - 1000 Hz) and (iii) a time-scale
dimension (approx. 0.5 - 20 Hz). It is this third temporal
frequency dimension that takes care of your long periods, and
also, I would argue, of time-interval discrimination (see Todd
and Brown, 1996) and streaming (see Todd, 1996).

In fact, in the most recent version of my model (to appear in
Todd and Lee, in Steve Greenberg's edited book) a cortical
unit requires 9 parameters to describe its receptive field. If
we assume that the cortex does have access to some kind of 2-D
cochleotopic/periodotopic array (whether it is sub-cortical in
origin or not), then clearly this array will be changing over
time. A unit may be labeled according to its CF and BMF, and
clearly its activity is time dependent. So, in terms of
modelling, we may consider the cortex to have 3 input
dimensions:
(i) fc,
(ii) fm and
(iii) time t.

If we assume that, like the visual cortex, this input is
decomposed by populations of 3-D spatio-temporal filters, then
effectively the flow of information can be represented by
appropriately tuning and orienting these filters in a 3-D
scale-space. This requires 6 further parameters:
(iv) cochleotopic spatial freq. (cycles per oct.),
(v) periodotopic spatial freq. (cycles per oct.) and
(vi) time-scale freq. (Hz - the same dimension as above),
which together give the centroid in scale-space, and
(vii) cochleotopic space-constant,
(viii) periodotopic space-constant and
(ix) time-constant,
which define the spatio-temporal window. (A sketch of such a
unit is given below.)
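
For concreteness, here is a minimal Python sketch of such a
9-parameter unit, assuming a separable Gabor-like form over the
(fc, fm, t) input space; the function name, the Gaussian-times-
cosine form, and the treatment of (i)-(iii) as a CF/BMF/time-origin
centre are my illustration, not the definition given in Todd and Lee.

import numpy as np

def cortical_rf(fc, fm, t,
                cf, bmf, t0,        # (i)-(iii): CF, BMF, time origin
                k_fc, k_fm, f_ts,   # (iv)-(vi): centroid in scale-space
                s_fc, s_fm, tau):   # (vii)-(ix): spatio-temporal window
    # Gaussian spatio-temporal window centred on (cf, bmf, t0);
    # fc and fm in octaves re some reference, t in seconds.
    window = np.exp(-0.5 * (((fc - cf) / s_fc) ** 2
                            + ((fm - bmf) / s_fm) ** 2
                            + ((t - t0) / tau) ** 2))
    # Oriented carrier: spatial frequencies (iv)-(v) in cycles/octave,
    # time-scale frequency (vi) in Hz; the filter's orientation in the
    # 3-D scale-space is set by the ratio of the three.
    carrier = np.cos(2 * np.pi * (k_fc * (fc - cf)
                                  + k_fm * (fm - bmf)
                                  + f_ts * (t - t0)))
    return window * carrier

Tuning f_ts within roughly 0.5 - 20 Hz places a unit on the third
temporal frequency dimension described above.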

This model generates a number of different response types,
including AM and FM, low-pass and band-pass, spatial and
temporal, which can be used to describe primitive RFs for
stationary and moving pitch and timbral acoustic features. As I
said in the previous message, recent physiology (both Schreiner
and Shamma) has demonstrated dynamic spatio-temporal RFs.

> Furthermore, even if all you say about auditory time/frequency analysis
> in the cortex is true, there is still the phenomenon Al and I were
> referring to left to explain: temporal (i.e., envelope) patterns marked
> by signals in different frequency bands tend to divide into two streams
> and suffer a loss of discriminability.

In Todd and Brown (1996) we showed that one could certainly
account for the shape of the psychophysical law for time
interval discrimination in terms of a population of assumed
cortical band-pass AM-sensitive cells. In the streaming model
[Todd, N.P.McAngus (1996). An auditory cortical theory of
auditory stream segregation. Network: Computation in Neural
Systems, 7, 349-356]
this population would be bifurcated, thus reducing sensitivity,
since there would be fewer neurones in each stream with which to
make a discrimination.
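
The intuition can be put in signal-detection terms. A rough sketch
(my idealization, not the actual Todd and Brown model): if each
neurone contributes an independent noisy estimate of the interval,
pooled sensitivity grows as the square root of the population size,
so bifurcation costs a factor of sqrt(2):

import numpy as np

def dprime(n_neurones, single_unit_dprime=0.1):
    # Pooling n independent, equally informative units improves
    # discriminability as sqrt(n) (standard SDT assumption).
    return single_unit_dprime * np.sqrt(n_neurones)

n_total = 1000
one_stream = dprime(n_total)        # all units mark the same stream
two_streams = dprime(n_total // 2)  # population bifurcated by streaming
print(one_stream / two_streams)     # ~1.41, i.e. a sqrt(2) loss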

> Having done lots of experiments on the latter,
> and having tried to model the situation, the phenomenon in question
> looks as if what we hear (=the perceived temporal patterns) were mediated
> by an extra stage whenever the markers do not activate the same
> pool of neurons. Again, I would not object to the view that this extra
> stage is also located at a subcortical level, but you must admit that
> the data are not there to support the view (at least I haven't seen
> them in the time range we are talking about). Thus, a more parsimonious
> explanation, to my mind at least, would be to make the cortex responsible
> for keeping track of envelope timing information altogether.


Further, I have shown that one can account for the shape of
the psychophysical law of pure-tone AM detection, which has two
points of maximum sensitivity, one at about 3 Hz and another at
about 300 Hz. I have suggested that this severe departure from
Weber's Law arises because AM detection is mediated by two
separate populations, one cortical and the other subcortical
[Todd, N.P.McAngus (1996). Time discrimination and AM
detection. J. Acoust. Soc. Am., 100(4), Pt. 2, 2752]. I do not
know of any other model which accounts for the hump in the
middle - unless I am also wrong about that?
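
To illustrate how two populations yield two sensitivity maxima,
here is a hedged sketch; the log-Gaussian tuning shapes, the
bandwidths and the max-rule for combining the populations are my
assumptions, chosen only to reproduce the 3 Hz and 300 Hz peaks.

import numpy as np

def sensitivity(f_mod, centre, bw_octaves=1.5):
    # Log-Gaussian tuning of one detector population to AM rate.
    return np.exp(-0.5 * (np.log2(f_mod / centre) / bw_octaves) ** 2)

f = np.logspace(0, 3, 200)                    # AM rates, 1 Hz to 1 kHz
cortical = sensitivity(f, 3.0)                # slow, cortical population
subcortical = sensitivity(f, 300.0)           # fast, subcortical population
combined = np.maximum(cortical, subcortical)  # the better detector wins
# 'combined' peaks near 3 Hz and 300 Hz, with reduced sensitivity
# (a threshold hump) in between - a two-branch departure from Weber's Law.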


Neil


Dear Peter

On Thu Jun  5 22:42 BST 1997 Peter Cariani wrote:

> Lastly, in Bertrand's and my work in the auditory nerve
> it also became apparent to me that what is needed for
> periodicity pitch is an autocorrelation-like analysis of
> periodicity pitch, .....

> My impression is that the
> modulation detector idea will not work for the pitch
> shift of AM tones (first effect of pitch shift, or
> de Boer's rule)......

> I think these are critical issues for models based on
> time-to-place that I tried to bring up in Montreal last
> summer (I didn't do it to give you a hard time, I promise).


Yes, it is very clear that you are a strong proponent of this
class of model, and it is certainly true that power-spectrum
type models only respond strongly to first-order intervals.
However, if you had actually read the proceedings (and
listened to what I said then) you would have noticed that I did
indeed address these issues. Autocorrelation models are good,
but they are not perfect. To save you looking up your copy of
the Proc. ICMPC96, I quote the relevant section below.

"The next phenomenon we consider is that of virtual pitch
shift. Virtual pitch shift is described as the shift in
perceived pitch of a complex in which all the partials have
been shifted by a constant amount, so that they are no longer
harmonic. In the literature there are generally two distinct
accounts of pitch shift (Hartmann and Doty, 1996). Proponents
of temporal theories argue that pitch shift can be only
accounted for in the fine temporal structure of the AN
response,  since the envelope remains invariant. Earlier
place-pattern accounts are that pitch shift results from the
disparity between a central pattern and the excitation pattern,
i.e. a best fitting harmonic series.

It is of interest to see if pitch shift can be predicted by
[place-time] pattern matching against the sensory memory traces
without the spiking component. The signals for this example
were obtained from Example 21 of the ASA Auditory
Demonstrations CD and consist of the 4th, 5th and 6th harmonics
of a 200 Hz complex. As would be expected (without spiking), as
the three partials shift upwards their interaction terms remain
relatively invariant at 200 Hz. However, the pitch strength as
measured by the response of the recognition space (Figure 2)
does indeed show the correct type of pitch shift, including
pitch ambiguity. The strongest response is obtained for the
harmonic complex 800, 1000, 1200 Hz, although the estimated
pitch is a little less than 200 Hz. The 860, 1060, 1260 Hz
complex produces a pitch estimate at about 210 Hz. The 900,
1100, 1300 Hz complex produces an apparent bimodal response
with one estimate at about 215 Hz and another at about 185 Hz.
The 960, 1160, 1360 Hz complex also gives a bimodal response
with estimates of about 230 Hz and 190 Hz.

Clearly this simple pattern-matching model does seem to give
an account of pitch shift without fine structure. How then may
this be reconciled with the fine-structure account? In order to
investigate the effect of fine structure in the model, a
shifted complex (900, 1100, 1300 Hz) was presented to the model
including the spiking component (see Figure 3). It is clear
from Figure 3 that although fine structure appears to be only
weakly represented in the model, it does have the effect of
shifting the interaction term to about 215 Hz. It appears then
that both fine structure and central matching contribute to
pitch shift. It may be that the central auditory system
combines envelope and neural timing information (Cooke and
Brown, 1994) into a single image which is available for
learning and recognition. It is known that there are some small
discrepancies between current timing models and experimental
data, e.g. the slope of the pitch shift in the case of Meddis
and Hewitt (1991) and the predicted mistuning of the
fundamental in the case of Hartmann and Doty (1996). The
hybrid model proposed here may account for these small
discrepancies of the timing models, but this requires testing."
[Proc. ICMPC 96. p. 180-181.]
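
For what it is worth, those estimates sit close to the classic
first-effect approximation Peter mentions (de Boer's rule): the
pitch of a shifted complex is roughly f_centre / n for integers n
near f_centre / f_spacing. A quick check (my arithmetic, not part
of the proceedings):

import math

def first_effect_pitches(f_centre, f_spacing=200.0):
    # Candidate pitches f_centre / n for the integers n bracketing
    # f_centre / f_spacing (first effect of pitch shift).
    ratio = f_centre / f_spacing
    return f_centre / math.ceil(ratio), f_centre / math.floor(ratio)

for partials in ([800, 1000, 1200], [860, 1060, 1260],
                 [900, 1100, 1300], [960, 1160, 1360]):
    centre = sum(partials) / 3
    print(partials, first_effect_pitches(centre))
# e.g. 900, 1100, 1300 Hz -> about 183 and 220 Hz, bracketing the
# model's bimodal estimates of about 185 and 215 Hz; 960, 1160,
# 1360 Hz -> about 193 and 232 Hz, against the model's 190 and 230 Hz.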

To be honest, I am quite agnostic as to what mechanism is
responsible at a sub-cortical level for the time-to-place
mapping. Both power spectra and autocorrelation (which are
Fourier twins, don't forget) can do this. Actually, the most
neurologically plausible model I have seen is Gerald Langner's
(neither autocorrelation nor power spectrum), since it includes
both DCN and VCN components. Either way, it does not alter the
cortical model I have proposed.
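
(The "Fourier twins" remark is just the Wiener-Khinchin relation:
the autocorrelation of a signal is the inverse Fourier transform of
its power spectrum, so the two carry the same periodicity
information. A minimal numerical check:

import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)

# Autocorrelation via the power spectrum (Wiener-Khinchin)...
acf_via_spectrum = np.fft.ifft(np.abs(np.fft.fft(x)) ** 2).real

# ...agrees with the direct circular autocorrelation.
acf_direct = np.array([np.dot(x, np.roll(x, -k)) for k in range(len(x))])
assert np.allclose(acf_via_spectrum, acf_direct)

Neither representation adds information the other lacks; they differ
only in how conveniently a given mechanism can read the periodicity out.)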

Neil