
Re: [AUDITORY] On 3D audio rendering for signals with the low sampling frequency



Dear Frederick,

Thank you so much for the references that you mentioned. 

"[...] up–down cues are located mainly in the 6–12-kHz band, and front–back cues in the 8–16-kHz band." 
According to this statement, it seems impossible to solve the problems of elevation perception and front-back confusion when the output signal is sampled at 16 kHz, since such a signal carries no content above 8 kHz.
Although I know it is difficult, I will keep looking for a solution.
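
To put rough numbers on this, here is a minimal sketch that computes how much of each quoted cue band survives below the Nyquist frequency of a 16 kHz signal. The band edges come from the quotation above; the function name and the use of a linear frequency scale are my own assumptions, chosen only for illustration.

```python
def fraction_below_nyquist(band_hz, fs_hz):
    """Return the fraction of a (low, high) band that lies below fs/2.

    Assumption: the fraction is measured on a linear frequency axis.
    """
    low, high = band_hz
    nyquist = fs_hz / 2.0
    kept = max(0.0, min(high, nyquist) - low)
    return kept / (high - low)


FS = 16000                       # output sampling rate discussed in this thread
UP_DOWN_BAND = (6000, 12000)     # band edges taken from the quoted statement
FRONT_BACK_BAND = (8000, 16000)  # band edges taken from the quoted statement

up_down_kept = fraction_below_nyquist(UP_DOWN_BAND, FS)
front_back_kept = fraction_below_nyquist(FRONT_BACK_BAND, FS)

print(f"up-down band kept:    {up_down_kept:.0%}")
print(f"front-back band kept: {front_back_kept:.0%}")
```

Under these assumptions only the 6-8 kHz part of the up-down band remains, and none of the front-back band does.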

Thanks again.

Best regards,
Junfeng 

On Sat, Aug 13, 2022 at 12:50 AM Frederick Gallun <fgallun@xxxxxxxxx> wrote:
The literature on the HRTF over the past 60 years has made it very clear that "[...] up–down cues are located mainly in the 6–12-kHz band, and front–back cues in the 8–16-kHz band." (Langendijk and Bronkhorst, 2002)

Here are a few places to start:

Langendijk, E. H. A., & Bronkhorst, A. W. (2002). Contribution of spectral cues to human sound localization. The Journal of the Acoustical Society of America, 112(4), 1583–1596. https://doi.org/10.1121/1.1501901

Mehrgardt, S., & Mellert, V. (1977). Transformation characteristics of the external human ear. The Journal of the Acoustical Society of America, 61(6), 1567–1576. https://doi.org/10.1121/1.381470

Shaw, E. A. G., & Teranishi, R. (1968). Sound Pressure Generated in an External‐Ear Replica and Real Human Ears by a Nearby Point Source. The Journal of the Acoustical Society of America, 44(1), 240–249. https://doi.org/10.1121/1.1911059

---------------------------------------------

Frederick (Erick) Gallun, PhD, FASA, FASHA | he/him/his

Professor, Oregon Hearing Research Center, Oregon Health & Science University
"Diversity is like being invited to a party, Inclusion is being asked to dance, and Belonging is dancing like no one’s watching" - Gregory Lewis


On Thu, Aug 11, 2022 at 11:59 PM Junfeng Li <junfeng.li.1979@xxxxxxxxx> wrote:
Dear Leslie,

When downsampling to 8/16 kHz, we did find that the localization accuracy decreases, even for horizon
Do you have any good ideas for solving it?

Thanks a lot.

Best regards,
Junfeng 


On Thu, Aug 11, 2022 at 4:04 PM Prof Leslie Smith <l.s.smith@xxxxxxxxxxxxx> wrote:
I'd also wonder about the time resolution: 16 kHz = 1/16000 s between
samples = 62.5 microseconds.
That's relatively long for ITD (TDOA) estimation, which would suggest that
localisation of lower frequency signals would be impeded.

(I don't have evidence for this: it's just a suggestion).
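
As a rough illustration of the arithmetic above, the sketch below computes the sample period at 16 kHz and how many whole-sample delay steps fit inside an assumed maximum ITD. The 660 microsecond figure is an assumption (a commonly quoted approximate value for an adult head) and does not come from this thread.

```python
FS = 16000          # sampling rate discussed in this thread, in Hz
MAX_ITD_S = 660e-6  # ASSUMPTION: approximate maximum human ITD, in seconds

# Time between consecutive samples, in microseconds.
sample_period_us = 1e6 / FS

# Number of whole-sample delay steps that fit inside the assumed maximum ITD.
itd_steps = int(MAX_ITD_S * FS)

print(f"sample period: {sample_period_us} microseconds")
print(f"whole-sample ITD steps up to the assumed maximum: {itd_steps}")
```

This only counts integer-sample delays; a band-limited signal can in principle also be delayed by a fraction of a sample with an interpolation filter, so the step count is a limit of integer-delay rendering rather than of the sampling rate itself.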

--Leslie Smith

Junfeng Li wrote:
> Dear all,
>
> We are working on 3D audio rendering for signals with a low sampling
> frequency.
> As you may know, HRTFs are normally measured at a high sampling
> frequency, e.g., 48 kHz or 44.1 kHz. However, the sampling frequency of
> the sound signals in our application is restricted to 16 kHz. Therefore,
> to render this low-frequency (≤8 kHz) signal, one straightforward way is
> to first downsample the HRTFs from 48 kHz/44.1 kHz to 16 kHz and then
> convolve them with the sound signals.
> However, the sound localization performance of signals rendered with
> this approach is greatly decreased, especially for elevation perception.
> To improve the sound localization performance, I am wondering whether
> there is a good method to solve or mitigate this problem in this
> scenario.
>
> Any discussion is welcome.
>
> Thanks a lot again.
>
> Best regards,
> Junfeng
>
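
For reference, the baseline approach described in the quoted message (downsample the HRIRs, then convolve) might be sketched as follows. This is a minimal standard-library illustration, not the poster's implementation: the function names, the 31-tap Hamming-windowed sinc filter, the integer factor of 3 (48 kHz to 16 kHz) and the toy impulse "HRIR" are all assumptions made for the example.

```python
import math


def lowpass_taps(num_taps, cutoff):
    """Hamming-windowed sinc low-pass; cutoff is a fraction of the input rate."""
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            sinc = 2.0 * cutoff
        else:
            sinc = math.sin(2.0 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    gain = sum(taps)
    return [t / gain for t in taps]  # normalise to unity gain at DC


def convolve(a, b):
    """Direct-form linear convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out


def downsample(x, factor, num_taps=31):
    """Low-pass at the new Nyquist frequency, then keep every factor-th sample."""
    taps = lowpass_taps(num_taps, 0.5 / factor)
    filtered = convolve(x, taps)
    delay = (num_taps - 1) // 2  # compensate the filter's group delay
    return filtered[delay:delay + len(x):factor]


def render(mono, hrir_left, hrir_right, factor=3):
    """Downsample both HRIRs by `factor`, then convolve each with the mono signal."""
    # Scaling by `factor` keeps the filter gain after decimating an impulse response.
    h_left = [factor * v for v in downsample(hrir_left, factor)]
    h_right = [factor * v for v in downsample(hrir_right, factor)]
    return convolve(mono, h_left), convolve(mono, h_right)


# Toy data (hypothetical): a unit impulse stands in for a measured 48 kHz HRIR.
hrir_48k = [1.0] + [0.0] * 8
mono_16k = [1.0, 0.5, -0.25, 0.0]
left, right = render(mono_16k, hrir_48k, hrir_48k)
print(len(left), len(right))
```

Note that this only reproduces the baseline: everything above 8 kHz is removed by the low-pass step, so the sketch does not recover the high-frequency cues discussed elsewhere in this thread.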


--
Prof Leslie Smith (Emeritus)
Computing Science & Mathematics,
University of Stirling, Stirling FK9 4LA
Scotland, UK
Tel +44 1786 467435
Web: http://www.cs.stir.ac.uk/~lss
Blog: http://lestheprof.com