Re: [AUDITORY] Gammatone filter bank in MATLABr2019a (Jihad Ibrahim)


Subject: Re: [AUDITORY] Gammatone filter bank in MATLABr2019a
From:    Jihad Ibrahim  <jibrahim@xxxxxxxx>
Date:    Mon, 20 May 2019 16:25:00 +0000
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Hi all,

I am a developer in Audio Toolbox at MathWorks, and I just wanted to let everyone know that we are capturing your comments regarding the new R2019a release and really appreciate your feedback.

It will take us some time to digest this feedback and convert it into user-visible changes, but I thought I'd share a few notes in the meantime:

* Regarding Bastian Epp's initial post: He is right to point out that the image might be misleading and could be interpreted as indicating an equivalence between the cochlea and the gammatone filter bank. We will aim to remove the image of the basilar membrane in the next release to help avoid that incorrect interpretation.

* Regarding Richard F. Lyon's post: The confusion here is due to an ambiguously worded sentence. The gammatone filter bank implemented in Audio Toolbox follows the algorithm described in [1] (Slaney). [1] says the algorithm is an implementation of an idea proposed by [2] (Patterson et al.). [2] is in general a good primer for understanding [1], which is why we thought it was good to reference. We think we should reword this more carefully.

* The formula stating that the bandwidth is 1.019*erb2hz(fc) does indeed have a typo. We will fix this ASAP, starting with the online documentation.

* Regarding the limited parametrizations of the function(s): So far, Audio Toolbox has focused on providing simple and fast implementations of feature extractors. The idea is to find a balance between the needs of an expert in auditory science and those of someone looking to build a machine learning or deep learning application. That being said, if exposing more parameters would enable more workflows, then we would definitely consider adding more options to the functions. We plan to investigate alternative options, and we may try to reach out to some of those who commented on this for additional feedback.

* We agree that the cubic root is a very common implementation of GTCC. We will investigate offering the option of using a cubic root in the nonlinear rectification stage (along with the log option, which is used as well). Rabiner and Schafer are referenced because the computation of the deltas is implemented based on Theory and Applications of Digital Speech Processing.

* Regarding Volker Hohmann's note on the re-synthesis method being non-optimal: The intention of the example was to showcase a straightforward and simple usage of the object rather than to demonstrate how best to achieve reconstruction. We agree that the showcased method is not optimal, and we will reword the example to clarify this. We will also consider adding an optimal reconstruction example based on Dr. Hohmann's paper.

Regards,
Jihad Ibrahim
Developer, Audio Toolbox, MathWorks
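For readers following the thread, the log and cubic-root options mentioned in the message differ only in the compressive nonlinearity applied to the per-band energies before the cepstral transform. The following is a minimal sketch under stated assumptions, not the Audio Toolbox implementation: the function names (`compress`, `dct_ii`, `cepstral_coeffs`), the energy floor, and the unnormalized DCT-II are all illustrative choices, and the band energies are made-up numbers rather than the output of a real gammatone filter bank.

```python
import math


def compress(band_energies, mode="log", floor=1e-12):
    """Apply a compressive nonlinearity to per-band energies (hypothetical helper)."""
    if mode == "log":
        # Floor the energies so that silent bands do not produce log(0).
        return [math.log(max(e, floor)) for e in band_energies]
    if mode == "cubic-root":
        # Clamp at zero so tiny negative rounding errors cannot yield complex values.
        return [max(e, 0.0) ** (1.0 / 3.0) for e in band_energies]
    raise ValueError(f"unknown mode: {mode!r}")


def dct_ii(values, num_coeffs):
    """Unnormalized DCT-II, keeping only the first num_coeffs coefficients."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(num_coeffs)
    ]


def cepstral_coeffs(band_energies, mode="log", num_coeffs=4):
    """Compress the band energies, then decorrelate them with a DCT."""
    return dct_ii(compress(band_energies, mode), num_coeffs)


if __name__ == "__main__":
    # Illustrative energies only; a real front end would supply one value per band.
    energies = [1.0, 8.0, 27.0, 64.0]
    print("log:       ", cepstral_coeffs(energies, "log"))
    print("cubic-root:", cepstral_coeffs(energies, "cubic-root"))
```

Keeping the nonlinearity in its own function means the two variants share the rest of the pipeline, which is one way a single option could switch between them.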


This message came from the mail archive
src/postings/2019/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University