Re: [AUDITORY] converting masking thresholds to masker levels of speech sounds (Frederico Pereira)


Subject: Re: [AUDITORY] converting masking thresholds to masker levels of speech sounds
From:    Frederico Pereira  <pereira.frederico@xxxxxxxx>
Date:    Wed, 29 Jan 2020 19:26:41 +0000
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Hi Mengli,

No, I haven't tried other models, nor did I account for temporal effects at this point. Having temporal effects integrated into the psychoacoustic model would be something quite unique compared with the existing routines I've seen, yes!

regards,

Frederico

On Wed, Jan 29, 2020 at 1:08 PM Feng, Mengli (2018) <Mengli.Feng.2018@xxxxxxxx> wrote:
> Hi Frederico,
>
> Thanks very much for the code!
>
> I did the same thing using ISO psychoacoustic model 2. I was thinking about
> using models to account for temporal effects. Have you tried more advanced
> auditory models?
>
> Best wishes,
> Mengli
>
> --
> Mengli Feng
> PhD Student
> PGR Collective EPMS School Convenor
>
> Audio, Biosignals and Machine Learning Group
> Department of Electronic Engineering
> Royal Holloway, University of London
>
> Research Interest:
> Speech/voice production and perception
> Ongoing Project:
> the perceptual effect of bone-conducted sound of own voice
> >>> Pure Page
>
> ------------------------------
> *From:* Frederico Pereira <pereira.frederico@xxxxxxxx>
> *Sent:* Wednesday, January 29, 2020 12:15 pm
> *To:* Feng, Mengli (2018)
> *Cc:* AUDITORY@xxxxxxxx
> *Subject:* Re: [AUDITORY] converting masking thresholds to masker levels
> of speech sounds
>
> Hi Mengli,
>
> I'm currently working on something similar, and I've been developing on top
> of the code and psychoacoustic models based on:
> *ISO/IEC 11172-3:1993, Information technology – Coding of moving pictures
> and associated audio for digital storage media at up to about 1,5 Mbit/s –
> Part 3: Audio*
>
> https://ieeexplore.ieee.org/abstract/document/1296956
> and Matlab code provided by:
> https://www.petitcolas.net/fabien/software/mpeg/#references
>
> Hoping this is of some help to you.
>
> regards,
>
> Frederico
>
> On Tue, Jan 28, 2020 at 5:19 AM Feng, Mengli (2018) <
> Mengli.Feng.2018@xxxxxxxx> wrote:
>
>> Dear All,
>>
>> I am trying to convert masking curves into the frequency responses of the
>> original maskers (single speech sounds). The maskees I am using are
>> narrow-band noises at different frequencies.
>>
>> It has taken me enormous effort to find an auditory model that makes
>> accurate predictions, considering the maskers are complex tones with
>> multiple harmonics in the high-frequency region. Might anyone provide some
>> guidance or advice on finding a suitable model?
>>
>> Is it even possible to do such a prediction knowing only the frequency
>> responses of the maskees and the masking thresholds, given that temporal
>> effects would inevitably appear because of the higher harmonics in human
>> speech sounds? Any opinions?
>>
>> Any suggestion would be greatly appreciated!
>>
>> Best Regards,
>> Mengli
>>
>> --
>> Mengli Feng
>> PhD Student
>> PGR Collective EPMS School Convenor
>>
>> Audio, Biosignals and Machine Learning Group
>> Department of Electronic Engineering
>> Royal Holloway, University of London
>>
>> Research Interest:
>> Speech/voice production and perception
>> Ongoing Project:
>> the perceptual effect of bone-conducted sound of own voice
>> >>> Pure Page
>
> --
> Frederico Pereira
> Mobile:+351937356301
> Email:pereira.frederico@xxxxxxxx

--
Frederico Pereira
Mobile:+61409066693
Email:pereira.frederico@xxxxxxxx
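
For anyone who wants to experiment before digging into the MATLAB routines linked above, a minimal Python sketch of the kind of calculation those psychoacoustic models perform for a single tonal masker looks like this: map frequencies to the Bark scale and spread the masker level with a Schroeder-style spreading function. The function names and the fixed offset_db are illustrative simplifications, not the ISO/IEC 11172-3 routines.

import numpy as np

def hz_to_bark(f_hz):
    """Zwicker & Terhardt (1980) approximation of the Bark scale."""
    f = np.asarray(f_hz, dtype=float)
    return 13.0 * np.arctan(0.00076 * f) + 3.5 * np.arctan((f / 7500.0) ** 2)

def spread_db(dz_bark):
    """Schroeder et al. (1979) spreading function in dB, where dz_bark is the
    maskee-minus-masker distance on the Bark scale."""
    return 15.81 + 7.5 * (dz_bark + 0.474) - 17.5 * np.sqrt(1.0 + (dz_bark + 0.474) ** 2)

def masking_threshold_db(masker_freq_hz, masker_level_db, maskee_freqs_hz, offset_db=10.0):
    """Predicted simultaneous-masking threshold at each maskee frequency for one
    tonal masker.  offset_db is a crude fixed stand-in for the level- and
    frequency-dependent tonal-masker offsets used in the ISO models."""
    dz = hz_to_bark(maskee_freqs_hz) - hz_to_bark(masker_freq_hz)
    return masker_level_db + spread_db(dz) - offset_db

# Example: a 1 kHz masker at 80 dB SPL, probed by narrow-band noise maskees
maskees_hz = np.array([500.0, 1000.0, 2000.0, 4000.0])
print(masking_threshold_db(1000.0, 80.0, maskees_hz))

A real analysis would of course work frame by frame on an FFT of the speech, classify tonal versus noise-like components, and combine the contributions of all maskers, which is what the MPEG psychoacoustic models and the linked MATLAB code do.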
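Going in the other direction, which is the question in the subject line, one crude way to recover a masker level from a measured threshold under the same single-tonal-masker assumption is to invert that spreading relation; with thresholds measured at several maskee frequencies, the per-frequency estimates can simply be averaged. This is only a sketch: it ignores additivity of masking across harmonics, the threshold in quiet, and any temporal effects, and it reuses hz_to_bark() and spread_db() from the snippet above.

def masker_level_from_thresholds(masker_freq_hz, maskee_freqs_hz, thresholds_db, offset_db=10.0):
    """Crude inversion of the single-masker sketch above: each measured threshold
    implies one masker level, and averaging them gives the least-squares estimate
    in dB.  Requires hz_to_bark() and spread_db() from the previous snippet."""
    dz = hz_to_bark(maskee_freqs_hz) - hz_to_bark(masker_freq_hz)
    implied_db = np.asarray(thresholds_db, dtype=float) - spread_db(dz) + offset_db
    return implied_db.mean()

# Example with hypothetical thresholds measured at three maskee frequencies
print(masker_level_from_thresholds(1000.0, [800.0, 1000.0, 1250.0], [55.0, 70.0, 62.0]))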


This message came from the mail archive
src/postings/2020/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University