Re: [AUDITORY] On 3D audio rendering for signals with the low sampling frequency (Junfeng Li)


Subject: Re: [AUDITORY] On 3D audio rendering for signals with the low sampling frequency
From:    Junfeng Li  <junfeng.li.1979@xxxxxxxx>
Date:    Thu, 11 Aug 2022 13:15:12 +0800

Dear Dick,

Thanks a lot for your information.

Yeah, the main problem for us is the limitation of the 16 kHz sampling frequency at the output side. Therefore, even if we do bandwidth extension for the input signal, we have to downsample to 16 kHz after the 3D rendering processing. I am wondering whether there is any possible method based on some psychoacoustic principle, or something like that?

Thanks again.

Best regards,
Junfeng

On Thu, Aug 11, 2022 at 12:29 PM Richard F. Lyon <dicklyon@xxxxxxxx> wrote:

> You could do "bandwidth extension" on the signals you want to spatialize,
> e.g. with some of the methods at
> https://gfx.cs.princeton.edu/pubs/Su_2021_BEI/ICASSP2021_Su_Wang_BWE.pdf
> and then apply the high-sample-rate HRTFs.
> Of course, if your system has a 16 ksps limitation on the output side,
> that will be of no use.
>
> Dick
>
> On Wed, Aug 10, 2022 at 9:22 PM Junfeng Li <junfeng.li.1979@xxxxxxxx>
> wrote:
>
>> Dear all,
>>
>> We are working on 3D audio rendering for signals with low sampling
>> frequency.
>> As you may know, the HRTFs are normally measured at the high sampling
>> frequency, e.g., 48kHz or 44.1kHz. However, the sampling frequency of sound
>> signals in our application is restricted to 16 kHz. Therefore, to render
>> this low-frequency (≤8kHz) signal, one straight way is to first downsample
>> the HRTFs from 48kHz/44.1kHz to 16kHz and then convolve with sound signals.
>> However, the sound localization performance of the signal rendered with
>> this approach is greatly decreased, especially elevation perception. To
>> improve the sound localization performance, I am now wondering whether
>> there is a certain good method to solve or mitigate this problem in this
>> scenario.
>>
>> Any discussion is welcome.
>>
>> Thanks a lot again.
>>
>> Best regards,
>> Junfeng
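
For reference, the baseline described in the original question (downsample the HRTFs to 16 kHz, then convolve with the signal) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the poster's actual implementation: the function names, the 63-tap Hamming-windowed low-pass filter, and the impulse-only placeholder HRIRs are all hypothetical, and it covers only the integer 48 kHz to 16 kHz case (factor 3).

```python
import numpy as np

def lowpass_fir(num_taps, cutoff):
    """Windowed-sinc low-pass FIR. cutoff is a fraction of Nyquist (0..1)."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    h = cutoff * np.sinc(cutoff * n) * np.hamming(num_taps)
    return h / h.sum()  # normalize to unity gain at DC

def downsample_hrir(hrir, factor=3, num_taps=63):
    """Anti-alias filter and decimate an HRIR (48 kHz -> 16 kHz for factor=3)."""
    lp = lowpass_fir(num_taps, 1.0 / factor)
    filtered = np.convolve(hrir, lp)
    delay = (num_taps - 1) // 2  # group delay of the linear-phase filter
    # Remove the filter delay, keep every factor-th sample, and scale by
    # factor so the decimated impulse response keeps its original gain.
    return factor * filtered[delay:delay + len(hrir):factor]

def render_binaural(signal, hrir_left, hrir_right):
    """Convolve a mono signal with an equal-length HRIR pair; returns (2, N)."""
    return np.stack([np.convolve(signal, hrir_left),
                     np.convolve(signal, hrir_right)])

if __name__ == "__main__":
    fs_out = 16000
    # Placeholder HRIRs at 48 kHz (pure delays, NOT measured data).
    hrir_l_48 = np.zeros(96); hrir_l_48[30] = 1.0
    hrir_r_48 = np.zeros(96); hrir_r_48[45] = 0.7
    hrir_l_16 = downsample_hrir(hrir_l_48)
    hrir_r_16 = downsample_hrir(hrir_r_48)
    # One second of a 1 kHz tone sampled at 16 kHz as the input signal.
    t = np.arange(fs_out) / fs_out
    mono = np.sin(2 * np.pi * 1000 * t)
    stereo = render_binaural(mono, hrir_l_16, hrir_r_16)
    print(stereo.shape)
```

For the 44.1 kHz case the ratio is not an integer (16000/44100 = 160/441), so a polyphase resampler such as SciPy's `scipy.signal.resample_poly` would be a more natural choice than the plain decimation used here. Note that this sketch only reproduces the baseline; anything in the HRTF above 8 kHz is removed by the low-pass filter and cannot be represented at a 16 kHz rate.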


This message came from the mail archive
src/postings/2022/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University