Re: About importance of "phase" in sound recognition (James Johnston)


Subject: Re: About importance of "phase" in sound recognition
From:    James Johnston  <James.Johnston@xxxxxxxx>
Date:    Mon, 11 Oct 2010 11:13:26 -0700
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Re: Below. I think a frequency tiling that is somewhat like a wavelet's is quite appropriate; in fact, something more like a tiling of 0.7 Bark at low frequencies and 1 ERB at higher frequencies, with attention paid to the slope of the filter skirts, would be ideal. But I don't think a wavelet is ideal for auditory analysis, since the ear consists of a set of heavily overlapped "bands" that are far from 1:1 and onto. I know from working on loudness models that you must have a filter at each 1/3 ERB (at higher frequencies; let's stick to something like 0.7 Bark at low frequencies, please), with the appropriate response CENTERED on that frequency. Using minimum-phase filters seems OK for this.

But with wavelet transforms, you're going to have 1:1 performance, and spacing accordingly, with critical sampling properties. In my experience, this is not going to handle the edges of bands very well.

In particular, I'm objecting to the 1:1 and onto properties of the wavelet; they do not match how the ear works. Of course, if you must do exact reconstruction, that's a different issue.

__________________________
James D. Johnston (jj@xxxxxxxx)
CHIEF SCIENTIST - DTS, Inc.

Those interested in the mathematical basis of phase perception might like to look at a paper by Martin Reimann that appeared in JASA a few years ago. After demonstrating that the cochlea performs a wavelet transform rather than a windowed Fourier transform, he goes on to describe how phase operates in the wavelet representation of auditory processing.
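
As a rough illustration of the 1/3 ERB filter spacing described in the reply above, the sketch below lays out filter center frequencies on the ERB-number scale. It assumes the Glasberg and Moore (1990) ERB formulas, which the post does not name, and it uses ERB spacing over the whole range rather than switching to 0.7 Bark at low frequencies. The function names and the 500 Hz to 8 kHz range are hypothetical, chosen only for illustration. It computes center frequencies only; it does not design the filters or their skirts.

```python
import math


def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz at f_hz (assumed Glasberg & Moore 1990 fit)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)


def hz_to_erb_number(f_hz):
    """Map frequency in Hz to the ERB-number scale (same assumed fit)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def erb_number_to_hz(e):
    """Inverse of hz_to_erb_number."""
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37


def center_frequencies(f_lo, f_hi, step_erb=1.0 / 3.0):
    """Center frequencies from f_lo up to at most f_hi, spaced step_erb apart on the ERB-number scale."""
    e_lo = hz_to_erb_number(f_lo)
    e_hi = hz_to_erb_number(f_hi)
    # Small epsilon guards against losing the last center to rounding.
    n = int(math.floor((e_hi - e_lo) / step_erb + 1e-9)) + 1
    return [erb_number_to_hz(e_lo + i * step_erb) for i in range(n)]


if __name__ == "__main__":
    # Overlapped layout: one filter every 1/3 ERB.
    dense = center_frequencies(500.0, 8000.0)
    # For comparison: one filter per ERB, closer to a non-overlapped tiling.
    sparse = center_frequencies(500.0, 8000.0, step_erb=1.0)
    print("filters at 1/3 ERB spacing:", len(dense))
    print("filters at 1 ERB spacing:  ", len(sparse))
    for fc in dense[:4]:
        print("center %.1f Hz, ERB %.1f Hz" % (fc, erb_bandwidth(fc)))
```

With this spacing each filter's center sits roughly a third of a bandwidth from its neighbor, so adjacent filters overlap heavily, which is the opposite of the critically sampled, 1:1 layout of a wavelet transform.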

