Re: Granular synthesis and auditory segmentation (at)


Subject: Re: Granular synthesis and auditory segmentation
From:    at <meijerNATLAB.RESEARCH.PHILIPS.COM>
Date:    Wed, 21 Oct 1998 11:40:09 +0200

[BTW, for clarity of discussion: the following is about stationary
sounds only, where onsets, decay etc do not play a role, because the
discussion of temporal processing in non-stationary sounds is worth a
separate discussion, and things are already quite tricky without
involving non-stationary sound. I add this note because the original
subject of granular synthesis, and certainly my own application of
that, would normally deal with non-stationary sounds.]

I wrote (i.e., Peter Meijer)

> I'd love to hear about *psychophysical* auditory perception
> experiments that unambiguously demonstrate temporal processing
> in humans in the 3 to 5 kHz range! My expectation is that such
> results have not been found...

Peter Cariani replied

> Few psychophysical experiments unambiguously demonstrate that
> a particular neural mechanism is used, because there are
> many possible neural mechanisms that can carry out the
> same function. What they sometimes do, however, is show
> that sole reliance on a given kind of neural information is not
> sufficient to account for perceptual capability (which rules
> out that coding scheme) or that psychophysical judgements
> covary with the availability of particular kinds of
> neural information (which suggests but does not prove that
> that particular information is used).

Granted! I agree with your refinements. Hence, I would now equally
love to hear about *psychophysical* auditory perception experiments
that (hopefully) unambiguously demonstrate (or at least make highly
plausible) that a "place theory of hearing" is *in*sufficient (cannot
fully account for what happens) in the 3 to 5 kHz range. This is a
much weaker requirement than formulated in my earlier request.

Jont Allen replied to the same section

> I think this question needs some clarification. If you beat two
> tones at 10 kHz, say beat 10 and 10.05 kHz tones, you will hear
> the 50 HZ beat. This is clearly due to "temporal processing"
> above 5 kHz.

I'll try to clarify: (neural) "temporal processing" above 5 kHz is not
needed for your example, because the mechanical filtering and half-wave
rectification of your 10 + 10.05 kHz tone gives a strong 50 Hz
component entering the auditory nerve. This 50 Hz component can in fact
be viewed as a demodulated envelope of the original 10 + 10.05 kHz
tone. The 50 Hz component is well within regular neural bandwidth (we
don't even need the "volley principle" for that) and will most likely
also be seen in the interspike periodicities inside the auditory nerve,
and it is well below the 3 to 5 kHz range I was attempting to formulate
my hypothesis for. In other words, I think the only "temporal
processing" here operates on the 50 Hz component, which is accounted
for by place theory and half-wave rectification: no need for (neural)
"temporal processing" above 5 kHz here. What I basically want to know
is what the "volley principle" nerve frequencies in the 3-5 kHz range
bring us functionally, if that helps to clarify what I am after.

Peter Cariani added

> -- one can still make good octave judgments if the upper tone
> is at 3 kHz, but this becomes guesswork by the time one gets up
> to 5 kHz.

OK, I like this one. That *could* be a good argument to make temporal
processing up to 3 kHz plausible for explaining this psychophysically
observable effect, because one most likely needs temporal periodicity
information to obtain an absolute reference for making an octave
detectable as something "special". (Within a filter bank an octave
would not appear as special.) I checked Brian Moore's Psychology of
Hearing again on this, and even found (p. 209, 4th edition) a remark
that ``octave matches largely disappear above 5 kHz, the frequency at
which neural synchrony no longer appears to operate.'' I did a few
informal listening experiments on this myself, and found that I became
a lot less accurate in finding the octave from 1500 to 3000 Hz than I
was in finding the octave from 1000 to 2000 Hz, so I tend to think that
octave matching largely starts to break down somewhere between 2 kHz
and 3 kHz. This is not a scientific result, of course, but just my
informal subjective result. In other words, the question here becomes
whether octave matching breaks down in, say, the 2-3 kHz range or in
the 3-5 kHz range (as Brian Moore seems to suggest). If it is in the
3-5 kHz range, that would indeed (probably) falsify my hypothesis,
but...

Another important question would need an answer before I grant that my
layman hypothesis has been falsified: was this octave matching up to
3-5 kHz done with (very) low intensity tones, such that nonlinearities
can be neglected? If not, then nonlinear effects generate a 2.5 kHz
combination tone from a { 2500, 5000 Hz } pair, and temporal processing
up to "only" 2.5 kHz may then account for everything! In that case I
would tend to maintain my hypothesis that consequences of temporal
processing are not psychophysically observable in the 3-5 kHz range.
Even if one tone from the pair was presented after the other, one has
to be careful that in the above example the 5000 Hz harmonic of the
2500 Hz tone is not matched against the "pure" 5000 Hz tone in a way
that regular "place theory" could easily account for.

> From 3kHz to 5 kHz the quality of timing information as
> well as tonality and frequency discrimination decline
> precipitously. At 3kHz there is considerable phase-locking;
> at 5 kHz it is much much weaker.

Is tonality above 3 kHz a quality that really requires an accuracy that
cannot be obtained/explained from place theory, possibly via lower
frequency combination tones?

> But why should the burden of proof be placed on just one
> putative coding scheme? What in your opinion is the
> unambiguous evidence in favor of some other (name your
> favorite) coding scheme in the 3-5 kHz range?

With the reformulated request I tend to maintain that the place theory
of hearing is sufficient to account for what is observed
psychophysically in the 3-5 kHz range, rather than say that it is the
only possible account. As a matter of fact, temporal processing could
in principle account for everything, since it can encompass any type of
filterbank. However, if it really were that powerful, we would have no
need for a cochlea at all, and evolution would have had little
incentive to give us a cochlea. Moreover, the very absence (?) of
perception of effects that should be quite easy to detect via temporal
processing makes the absence of significant (neural) temporal
processing in the 3-5 kHz range rather plausible to me.

> 2. Phase locking and frequency discriminations covary.
> ...
> whereas rate place information shows the opposite trend,
> getting relatively better as frequency increases.

I don't understand this. One could have it both ways, depending on how
the cochlear filterbank is actually constructed. If it were constructed
to mimic Fourier analysis, it would even become frequency independent.
Do you imply that cochlear mechanical filtering actually gives higher
(relative?) accuracy at higher frequencies? What do you mean by
"relatively" in "relatively better"? Also, covariation is a risky
argument. In Holland there is a clear covariation between stork
population density and local human family size, but most of us no
longer conclude that storks bring babies. Maybe I'm just missing your
point...
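
As a rough numerical aside on the 10 + 10.05 kHz example discussed
earlier, the following is a minimal sketch (not a cochlear model) of
how half-wave rectification alone puts energy at the 50 Hz difference
frequency. The sample rate, the duration, the unit tone amplitudes and
the helper names (two_tone, amplitude_at) are all assumptions made for
illustration. Mechanical filtering and any neural low-pass behaviour
are left out entirely.

```python
import math

FS = 48_000                    # sample rate in Hz (assumed for this sketch)
DUR = 0.2                      # seconds; holds exactly 10 periods of 50 Hz
F1, F2 = 10_000.0, 10_050.0    # the two tones from the example

def two_tone(n):
    """Sample n of the sum of two unit-amplitude sine tones."""
    t = n / FS
    return math.sin(2 * math.pi * F1 * t) + math.sin(2 * math.pi * F2 * t)

def amplitude_at(signal, freq):
    """Amplitude of the sinusoidal component of `signal` at `freq`.

    Evaluates a single DFT bin directly; `freq` should complete a whole
    number of cycles within the signal for a clean reading.
    """
    re = sum(x * math.cos(2 * math.pi * freq * n / FS)
             for n, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * n / FS)
             for n, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / len(signal)

N = int(FS * DUR)
raw = [two_tone(n) for n in range(N)]
# Idealised half-wave rectifier: keep positive excursions, zero the rest.
rectified = [max(x, 0.0) for x in raw]

raw_50 = amplitude_at(raw, 50.0)          # linear sum: no 50 Hz component
rect_50 = amplitude_at(rectified, 50.0)   # rectified: 50 Hz envelope term

print(f"50 Hz amplitude before rectification: {raw_50:.6f}")
print(f"50 Hz amplitude after rectification:  {rect_50:.6f}")
```

The linear sum of the two tones contains no spectral component at
50 Hz; the component only appears once the rectifying nonlinearity is
applied, which is the sense in which it can be read as a demodulated
envelope.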

Let me emphasize that I greatly appreciated the comments given by Peter
Cariani and Jont Allen, as these certainly help me deepen my
understanding of the topic and its many pitfalls. If it really turns
out that neural processing has significant psychophysically observable
effects up to 5 kHz, that would be quite fascinating. Things like
combination tones and harmonics can rather easily fool us, though,
since with nonlinear effects we only need half the neural processing
frequency (2.5 kHz) to account for many psychophysically observable
effects that would at first sight *seem* to imply (neural) "temporal
processing" up to, say, 5 kHz.

Best wishes,

Peter Meijer

Soundscapes from The vOICe - Seeing with your Ears!
http://ourworld.compuserve.com/homepages/Peter_Meijer/

McGill is running a new version of LISTSERV (1.8d on Windows NT).
Information is available on the WEB at http://www.mcgill.ca/cc/listserv


This message came from the mail archive
http://www.auditory.org/postings/1998/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University