
Re: [AUDITORY] On 3D audio rendering for signals with the low sampling frequency



Dear Junfeng,

Thanks for sharing this observation here. I do not have a solution right now, but I am curious to know more.
I can relate the loss in elevation perception to poor capture of the spectral notches present in the HRTFs. But I did not expect the notches beyond 8 kHz to be this crucial. Are the HRTFs personalized?

Also, I am now wondering: is it always the case that elevation information is poor for 16 kHz audio signals? Is there literature on this?
As a quick test, I will also try downsampling the HRTFs to 16 kHz without low-pass filtering, to see whether the aliased HRTF spectrum significantly corrupts the 3-D perception. My bet is: not much. But I will keep my fingers crossed.
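
The comparison described above could be set up roughly as follows. This is only a minimal sketch, assuming a 48 kHz HRIR and a decimation factor of 3; the function names, the 91-tap Hamming-windowed sinc filter, and the decaying-noise HRIR are all placeholders for illustration, not anyone's actual setup. It measures how far each 16 kHz version deviates in magnitude from the 48 kHz response over 0 to 8 kHz, which says nothing by itself about what is audible.

```python
import numpy as np

def lowpass_fir(num_taps, cutoff):
    """Hamming-windowed sinc low-pass; cutoff is a fraction of Nyquist (0..1)."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    h = cutoff * np.sinc(cutoff * n) * np.hamming(num_taps)
    return h / h.sum()  # normalize to unity gain at DC

def decimate_naive(hrir, factor=3):
    """Keep every `factor`-th sample with no anti-aliasing filter."""
    return factor * np.asarray(hrir, dtype=float)[::factor]

def decimate_filtered(hrir, factor=3, num_taps=91):
    """Low-pass below the new Nyquist frequency first, then decimate.

    num_taps is assumed odd so the filter delay is a whole number of samples.
    """
    filtered = np.convolve(hrir, lowpass_fir(num_taps, 1.0 / factor))
    delay = (num_taps - 1) // 2  # group delay of the linear-phase filter
    return factor * filtered[delay::factor]

def magnitude_db(h, n_fft):
    """Magnitude spectrum in dB (a small offset avoids log of zero)."""
    return 20.0 * np.log10(np.abs(np.fft.rfft(h, n_fft)) + 1e-12)

def deviation_db(hrir_48k, hrir_16k, n_fft=768, factor=3):
    """Deviation (dB) from the 48 kHz response on a shared 0-8 kHz grid.

    Assumes len(hrir_48k) <= n_fft and len(hrir_16k) <= n_fft // factor.
    """
    n_low = n_fft // factor  # gives the same bin spacing at the lower rate
    ref = magnitude_db(hrir_48k, n_fft)[: n_low // 2 + 1]
    return magnitude_db(hrir_16k, n_low) - ref

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder HRIR (decaying noise), NOT measured data.
    hrir_48k = rng.standard_normal(256) * np.exp(-np.arange(256) / 32.0)
    for name, fn in [("naive", decimate_naive), ("filtered", decimate_filtered)]:
        err = deviation_db(hrir_48k, fn(hrir_48k))
        freqs = np.arange(err.size) * 48000 / 768
        print(name, "mean |deviation| below 6 kHz (dB):",
              np.mean(np.abs(err[freqs < 6000])))
```

With a measured HRIR in place of the placeholder, the same deviation curves can be inspected around the notch frequencies rather than averaged.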

Cheers,
Neeraj

On Thu, Aug 11, 2022 at 11:04 AM Junfeng Li <junfeng.li.1979@xxxxxxxxx> wrote:
Dear Dick, 

Thanks a lot for your information.

Yes, the main problem for us is the limitation of the 16 kHz sampling frequency at the output side. Therefore, even if we do bandwidth extension on the input signal, we still have to downsample to 16 kHz after the 3D rendering processing. I am wondering whether there is any possible method that exploits some psychoacoustic principle to work around this?

Thanks again.

Best regards
Junfeng 

On Thu, Aug 11, 2022 at 12:29 PM Richard F. Lyon <dicklyon@xxxxxxx> wrote:
You could do "bandwidth extension" on the signals you want to spatialize, e.g. with some of the methods at
and then apply the high-sample-rate HRTFs. 
Of course, if your system has a 16 ksps limitation on the output side, that will be of no use.

Dick


On Wed, Aug 10, 2022 at 9:22 PM Junfeng Li <junfeng.li.1979@xxxxxxxxx> wrote:
Dear all, 

We are working on 3D audio rendering for signals with low sampling frequency. 
As you may know, HRTFs are normally measured at a high sampling frequency, e.g., 48 kHz or 44.1 kHz. However, the sampling frequency of the sound signals in our application is restricted to 16 kHz. Therefore, to render this band-limited (≤8 kHz) signal, one straightforward way is to first downsample the HRTFs from 48 kHz/44.1 kHz to 16 kHz and then convolve them with the sound signals. However, the sound localization performance for signals rendered with this approach is greatly degraded, especially elevation perception. I am now wondering whether there is a good method to solve or mitigate this problem in this scenario.
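
For concreteness, the downsample-then-convolve approach described above can be sketched as follows. This is a minimal illustration, assuming 48 kHz HRIRs (so an integer decimation factor of 3); the function names, the 91-tap Hamming-windowed sinc anti-aliasing filter, and the decaying-noise HRIRs are hypothetical placeholders rather than our actual processing chain. A 44.1 kHz set would need a rational resampler instead.

```python
import numpy as np

def lowpass_fir(num_taps, cutoff):
    """Hamming-windowed sinc low-pass; cutoff is a fraction of Nyquist (0..1)."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    h = cutoff * np.sinc(cutoff * n) * np.hamming(num_taps)
    return h / h.sum()  # normalize to unity gain at DC

def downsample_hrir(hrir, factor=3, num_taps=91):
    """Band-limit an HRIR to the new Nyquist frequency, then decimate it.

    num_taps is assumed odd so the filter delay is a whole number of samples.
    """
    filtered = np.convolve(hrir, lowpass_fir(num_taps, 1.0 / factor))
    delay = (num_taps - 1) // 2  # group delay of the linear-phase filter
    # Scaling by `factor` keeps the filter gain unchanged at the lower rate.
    return factor * filtered[delay::factor]

def render_binaural(signal_16k, hrir_left_16k, hrir_right_16k):
    """Convolve a mono 16 kHz signal with a left/right HRIR pair."""
    left = np.convolve(signal_16k, hrir_left_16k)
    right = np.convolve(signal_16k, hrir_right_16k)
    return np.stack([left, right], axis=1)  # shape: (samples, 2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder HRIRs (decaying noise), NOT measured data.
    decay = np.exp(-np.arange(256) / 32.0)
    hrir_l_48k = rng.standard_normal(256) * decay
    hrir_r_48k = rng.standard_normal(256) * decay
    signal_16k = rng.standard_normal(16000)  # 1 s of noise at 16 kHz
    out = render_binaural(signal_16k,
                          downsample_hrir(hrir_l_48k),
                          downsample_hrir(hrir_r_48k))
    print(out.shape)
```

Whatever filter is used, everything in the HRTF above 8 kHz is discarded at this step, which is where the high-frequency elevation cues are lost.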

Any discussion is welcome.

Thanks a lot again.

Best regards,
Junfeng