
Re: [AUDITORY] On 3D audio rendering for signals with the low sampling frequency



Dear John and all,

In the task you referred to, I aimed to obtain individual thresholds for discriminating between fast sequences of two or three Gabor pulses centered at 6 kHz. Each threshold was then converted to a bounded estimate of the instantaneous sampling rate that would hypothetically be needed to achieve it, assuming that the auditory system samples the stimulus and is not immune to (instantaneous) aliasing (i.e., it has no equivalent of an anti-aliasing filter). I tested only 10 normal-hearing people (PTA < 20 dB HL) and no hearing-impaired participants, and the entire experiment should be replicated at some point with more test points and better blocking. With this caveat in mind, I can only offer an anecdotal observation about the threshold-in-noise variation of the test.
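For readers unfamiliar with the stimulus type, here is a minimal sketch of a sequence of two or three Gabor pulses (Gaussian-windowed tone bursts) centered at 6 kHz. Only the 6 kHz center frequency comes from the description above; the pulse width `sigma_s`, the spacing `gap_s`, the 48 kHz sampling rate, and the function name are illustrative assumptions, not the parameters of the actual experiment.

```python
import math

def gabor_pulse_train(n_pulses, gap_s, fc_hz=6000.0, sigma_s=0.0005, fs_hz=48000):
    """Return samples of n_pulses Gaussian-windowed tones spaced gap_s apart.

    sigma_s, gap_s and fs_hz are assumed values chosen for illustration only.
    """
    # Place the first pulse 4 sigma after t = 0 so its envelope is not clipped.
    centers = [4 * sigma_s + k * gap_s for k in range(n_pulses)]
    duration = centers[-1] + 4 * sigma_s
    n_samples = int(round(duration * fs_hz)) + 1
    signal = []
    for n in range(n_samples):
        t = n / fs_hz
        x = 0.0
        for c in centers:
            # Gaussian envelope multiplied by a 6 kHz carrier in cosine phase.
            envelope = math.exp(-0.5 * ((t - c) / sigma_s) ** 2)
            x += envelope * math.cos(2 * math.pi * fc_hz * (t - c))
        signal.append(x)
    return signal

# A two-pulse and a three-pulse sequence with the same (assumed) 3 ms spacing.
two = gabor_pulse_train(2, gap_s=0.003)
three = gabor_pulse_train(3, gap_s=0.003)
```

In a discrimination task of this kind, the spacing would be varied adaptively until the listener can no longer tell the two sequences apart; that procedure is not sketched here.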

In the normal variation of the test, the best-performing listeners had thresholds that were very similar to those of other temporal acuity tests, such as gap detection (2-4 ms), whereas other subjects had much worse (longer) thresholds (up to 20-35 ms). When notched broadband noise was added to mask the off-frequency channels, the results split into two clusters. The thresholds of four subjects, including the three youngest participants (< 25 years old), did not change as a result of the noise, regardless of their baseline threshold. The thresholds of the other six subjects deteriorated significantly, also regardless of their individual baseline. Some of the subjects in the latter group self-reported difficulty hearing speech in noise, whereas none of the four "high-performing" subjects reported any such difficulty. I found it all very curious at the time, but did not have enough data to draw any conclusions.

Incidentally, the split in the subject group is not unlike the pattern reported in informational-masking studies, where only a subset of normal-hearing subjects is sensitive to off-frequency tonal maskers (e.g., Neff, Dethlefs, & Jesteadt, 1993).

All the best,
Adam.

On Wed, Aug 17, 2022, at 5:57 AM, Beerends, J.G. (John) wrote:

Hi Chris, Adam, All,

 

Should this effect of dense sampling at the onset have a practical consequence, namely that hearing loss could also be assessed with a 2/3-pulse discrimination task?

I am not aware that such tests are used or being developed.

(This idea was inspired by Appendix E of Adam's book: Weisser, A. (2021). Treatise on Hearing: The Temporal Auditory Imaging Theory Inspired by Optics and Communication. arXiv preprint arXiv:2111.04338.)

 

Regards,

John Beerends

http://beesikk.nl/JohnBeerends.htm

 

From: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx> On Behalf Of Chris Stecker
Sent: Tuesday, 16 August 2022 20:16
To: AUDITORY@xxxxxxxxxxxxxxx
Subject: Re: On 3D audio rendering for signals with the low sampling frequency

 

Hi all, particularly Leslie and Adam:

 

 

The ready availability of binaural information at sound onsets and other positive fluctuations of the amplitude envelope is well supported by decades of psychophysical evidence, including 20 years of my own publications. The overall evidence, and the theory it motivates (“RESTART theory”), is reviewed in a 2020 chapter of the Springer Handbook of Auditory Research by Les Bernstein, Andrew Brown, and me:

 

Stecker, G. C., Bernstein, L. R., and Brown, A. D. (2020). Binaural hearing with temporally complex signals. Chapter 5 in Goupell, M. J., Litovsky, R. Y, Popper, A. N., and Fay, R. R. (eds). Springer Handbook of Auditory Research Vol 73: Binaural Hearing. Switzerland: Springer International. doi:10.1007/978-3-030-57100-9 

 

Please contact me if you need help accessing the chapter. 

 

In quick summary, the evidence suggests that all forms of binaural cue (ITD of the envelope and fine structure, ILD, etc) available at any cochlear place (i.e. frequency) are specifically “sampled” at moments of positive envelope fluctuation. As Adam suggests, one obvious source of this “sampling” process is the strong adaptation exhibited in neural pathways prior to binaural interaction (e.g. hair cells, AN fibers, various cells of the cochlear nucleus). Indeed, phenomenological models that include realistic adaptive behavior exhibit many of the same properties observed psychophysically (Stecker 2020, Assoc Res Otolaryngol Abs 43). 

 

A feature of the data which is sometimes overlooked is the apparent refractory nature of this “sampling” process. New samples, or “onsets”, can occur in succession, but not much more quickly than 200-300 times per second (i.e., at intervals of 3-5 ms). Above that rate (e.g. for rapid paired pulses, “steady” tones, etc.) binaural processing is confined to the overall onset. This rate limitation itself defines what counts as an “onset” for binaural processing: below the critical rate, successive events each contribute roughly equally and independently to spatial perception.
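As a toy illustration of this rate limit (not the model from the chapter or the abstract cited above), the sketch below treats an event as a new “onset” only if it follows the previous event by at least a minimum gap. The 4 ms gap is an assumed value picked from the 3-5 ms range mentioned above, and the function name is hypothetical.

```python
def onset_events(event_times_ms, min_gap_ms=4.0):
    """Return the events that count as new onsets under a simple gap rule.

    An event is kept only if the preceding event (kept or not) occurred at
    least min_gap_ms earlier. min_gap_ms = 4.0 is an assumption for this toy.
    """
    onsets = []
    previous = None
    for t in sorted(event_times_ms):
        if previous is None or (t - previous) >= min_gap_ms:
            onsets.append(t)
        # Every event resets the timer, so a fast train never re-triggers.
        previous = t
    return onsets

slow_train = [10.0 * i for i in range(5)]  # 100 events per second
fast_train = [2.0 * i for i in range(5)]   # 500 events per second

print(onset_events(slow_train))  # every event is treated as an onset
print(onset_events(fast_train))  # only the first event is treated as an onset
```

With the slow train every pulse is sampled, whereas with the fast train only the overall onset is, which is the qualitative pattern described above. The hard threshold is a simplification of what is presumably a graded effect.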

 

What does this have to do with spatial cue representation at low sampling rates? Many of the mentions in this thread quite rightly invoke linear systems theory to understand the consequences of limiting bandwidth (i.e. due to slow sampling) on these representations. Various tricks may be suggested to somewhat extend the effective bandwidth (e.g. non-uniform sampling, etc.). I don’t have much to add there, except to consider how the brain might do it. 

 

In my view, it is important to keep in mind that no mechanisms of the ear or brain are, in fact, linear. Neuronal adaptation is highly nonlinear and also temporally asymmetric. A consequence is dramatic over-representation of rapid onset-like events–events that, in a linear system, would imply very broad bandwidth. Thus, auditory “channels” are capable of representations that apparently exceed the narrow “bandwidth” implied by their cochlear-place selectivity. That notion seems absurd on its face, because many of us have been trained to think about auditory function as “quasi-linear” (e.g. using terms like “auditory filter” to refer to neural pathways that are clearly not filters). But in fact it should not be surprising based on the actual physiology. 

 

This has clear consequences for loads of phenomena in binaural and spatial hearing: precedence, binaural adaptation, jitter in CI pulse timing, “straightness”, etc. (Stecker, Dietz, and Stern 2019(A), JASA 145:1759). 

 

Thank you for your attention, and for the interesting discussion! 

 

-Chris

 

 

 


 

G. Christopher Stecker, Ph.D., F.A.S.A.

 

Director, Spatial Hearing Lab

Director, Research Technology

Boys Town National Research Hospital

 

Coordinating Editor, Psychological and Physiological Acoustics

Journal of the Acoustical Society of America

 

 

 

 

 

 

 

 

 

 

On Aug 15, 2022, at 3:23 AM, Prof Leslie Smith <l.s.smith@xxxxxxxxxxxxx> wrote:

 


Dear all:

Some years ago, I worked on using sound at onsets for calculating source
direction in reverberant environments [1]. It's kind-of obvious, because
after the onset, the sound at the ear/microphone is made up of energy both
from the source and from reflections.

Sampling rates are normally constant, and techniques for compression are
aimed at recreating the percept of the original sound: I am under the
impression that this doesn't extend to the percept of precise location of
the sound. Perhaps we need novel compression/decompression  techniques
that include the relevant data for source location.

[1] L.S. Smith, S. Collins, "Determining ITDs using two microphones on a
flat panel during onset intervals with a biologically inspired spike-based
technique", IEEE Transactions on Audio, Speech, and Language Processing,
15, 8, 2278-2286, (2007).

--Leslie Smith

Adam Weisser wrote:




1. Compressed sensing - This heavily researched signal-processing method
uses signal sparsity to faithfully reconstruct undersampled signals [1].



.....



Neural adaptation can be thought of as dense
sampling of the signal around its onset / transient portion, which becomes
more sparsely sampled quickly after the onset. Because of adaptation, this
effect is very elusive, but I believe that it is measurable
notwithstanding. I tried to demonstrate it psychoacoustically in Appendix
E of [4]. While I don't know how it relates to binaural processing
directly, there may be instantaneous effects that may be detectable there
too, given that the input to both processing types is the same.

All the best,
Adam.



...


--
Prof Leslie Smith (Emeritus)
Computing Science & Mathematics,
University of Stirling, Stirling FK9 4LA
Scotland, UK
Tel +44 1786 467435


 

 

