Re: [AUDITORY] On 3D audio rendering for signals with the low sampling frequency (Junfeng Li)


Subject: Re: [AUDITORY] On 3D audio rendering for signals with the low sampling frequency
From:    Junfeng Li  <junfeng.li.1979@xxxxxxxx>
Date:    Fri, 12 Aug 2022 14:32:26 +0800

Dear Leslie,

When downsampling to 8/16kHz, we really found the localization accuracy decreases, even for horizon
Do you have any good ideas to solve it?

Thanks a lot.

Best regards,
Junfeng

On Thu, Aug 11, 2022 at 4:04 PM Prof Leslie Smith <l.s.smith@xxxxxxxx> wrote:

> I'd also wonder about the time resolution: 16KHz = 1/16000 sec between
> samples = 62 microseconds.
> That's relatively long for ITD (TDOA) estimation, which would suggest that
> localisation of lower frequency signals would be impeded.
>
> (I don't have evidence for this: it's just a suggestion).
>
> --Leslie Smith
>
> Junfeng Li wrote:
> > Dear all,
> >
> > We are working on 3D audio rendering for signals with low sampling
> > frequency.
> > As you may know, the HRTFs are normally measured at the high sampling
> > frequency, e.g., 48kHz or 44.1kHz. However, the sampling frequency of
> > sound signals in our application is restricted to 16 kHz. Therefore, to
> > render this low-frequency (≤8kHz) signal, one straight way is to first
> > downsample the HRTFs from 48kHz/44.1kHz to 16kHz and then convolve with
> > sound signals.
> > However, the sound localization performance of the signal rendered with
> > this approach is greatly decreased, especially elevation perception. To
> > improve the sound localization performance, I am now wondering whether
> > there is a certain good method to solve or mitigate this problem in this
> > scenario.
> >
> > Any discussion is welcome.
> >
> > Thanks a lot again.
> >
> > Best regards,
> > Junfeng
>
> --
> Prof Leslie Smith (Emeritus)
> Computing Science & Mathematics,
> University of Stirling, Stirling FK9 4LA
> Scotland, UK
> Tel +44 1786 467435
> Web: http://www.cs.stir.ac.uk/~lss
> Blog: http://lestheprof.com
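
The time-resolution arithmetic raised in the thread can be checked with a small sketch. It computes the sample period and the Nyquist limit for several sampling rates, and counts how many whole-sample delays fit inside a maximum interaural time difference. The value MAX_ITD_US (660 microseconds) is an assumed, typical figure for a human head and does not come from the thread; the function names are illustrative only. The sketch also assumes ITDs are applied as integer-sample delays, which a real renderer need not do.

```python
# Sketch only. MAX_ITD_US is an assumed typical maximum ITD for a human
# head; it is not a value stated in the thread.
MAX_ITD_US = 660.0


def sample_period_us(fs_hz: float) -> float:
    """Time between consecutive samples, in microseconds."""
    return 1e6 / fs_hz


def nyquist_hz(fs_hz: float) -> float:
    """Highest frequency representable at sampling rate fs_hz."""
    return fs_hz / 2.0


def itd_steps(fs_hz: float, max_itd_us: float = MAX_ITD_US) -> int:
    """Whole-sample delays that fit within max_itd_us on one side of the head."""
    return int(max_itd_us // sample_period_us(fs_hz))


if __name__ == "__main__":
    for fs in (8000, 16000, 44100, 48000):
        print(
            f"fs={fs} Hz: period={sample_period_us(fs):.1f} us, "
            f"Nyquist={nyquist_hz(fs):.0f} Hz, "
            f"integer ITD steps per side={itd_steps(fs)}"
        )
```

Under these assumptions, 16 kHz gives a 62.5 microsecond sample period and a band limit of 8 kHz, compared with roughly 20.8 microseconds and 24 kHz at 48 kHz. Whether either factor explains the localization loss reported above is not established in the thread.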


This message came from the mail archive
src/postings/2022/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University