Re: [AUDITORY] Semantic McGurk Effect ("Patel, Aniruddh D." )


Subject: Re: [AUDITORY] Semantic McGurk Effect
From:    "Patel, Aniruddh D."  <a.patel@xxxxxxxx>
Date:    Sat, 8 Aug 2020 12:52:28 +0000

Hi all,

Very interesting discussion, with great examples and references. Thanks Malcolm for kicking this off.

This is a favorite example of my Psychology of Music students, along these lines:

Original: https://www.youtube.com/watch?v=O5b7tgkdFH0

Updated: https://www.youtube.com/watch?v=nIwrgAnx6Q8

Best,
Ani

O Fortuna Misheard Lyrics <https://www.youtube.com/watch?v=nIwrgAnx6Q8>
Carl Orff - O Fortuna - Latin and English Lyrics <https://www.youtube.com/watch?v=O5b7tgkdFH0>

________________________________
From: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxx> on behalf of Arthur, Claire <claire.arthur@xxxxxxxx>
Sent: Friday, August 7, 2020 3:14 PM
To: AUDITORY@xxxxxxxx <AUDITORY@xxxxxxxx>
Subject: Re: [AUDITORY] Semantic McGurk Effect

I believe because of the odd timbre/background noise, our text priming helps us choose what to parse as the "negative space". So, for example, you hear the "ee" from "needle" if you are trying to hear that word, whereas you simply ignore that "ee" as it falls in between "brain" and "storm" if you are listening for brainstorm. Fun! Thanks for sharing.

Claire

Claire Arthur
Assistant Professor, School of Music
College of Design
Georgia Institute of Technology
(404) 894-9110
claire.arthur@xxxxxxxx

________________________________
From: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxx> on behalf of Prof. Roger K. Moore <0000011559506d60-dmarc-request@xxxxxxxx>
Sent: Friday, August 7, 2020 4:38 AM
To: AUDITORY@xxxxxxxx <AUDITORY@xxxxxxxx>
Subject: Re: [AUDITORY] Semantic McGurk Effect

I must admit to being surprised by the surprise engendered by this video. Anyone who was around during the early days of text-to-speech synthesis is very aware of the danger of presenting the text in advance of or simultaneous with the generated speech. The intelligibility of the resulting synthesis could be zero without the 'prior' and 100% with the visual cue.

So, given that we know that perception involves the integration of top-down expectations with bottom-up evidence (going right back to Richard Warren's work on the 'phoneme restoration effect'), why is this TikTok demo surprising? Or maybe I'm missing something?

Best wishes
Roger

--------------------------------------------------------------------------------------------
Prof ROGER K MOORE* BA(Hons) MSc PhD FIOA FISCA MIET

Chair of Spoken Language Processing
Vocal Interactivity Lab (VILab), Sheffield Robotics
Speech & Hearing Research Group (SPandH)
Department of Computer Science, UNIVERSITY OF SHEFFIELD
Regent Court, 211 Portobello, Sheffield, S1 4DP, UK

* Winner of the 2016 Antonio Zampolli Prize for "Outstanding Contributions
to the Advancement of Language Resources & Language Technology
Evaluation within Human Language Technologies"

e-mail: r.k.moore@xxxxxxxx
web: http://staffwww.dcs.shef.ac.uk/people/R.K.Moore/
twitter: @xxxxxxxx
Tel: +44 (0) 11422 21807
Fax: +44 (0) 11422 21810
Mob: +44 (0) 7910 073631

Editor-in-Chief: COMPUTER SPEECH AND LANGUAGE
(http://www.journals.elsevier.com/computer-speech-and-language/)
--------------------------------------------------------------------------------------------

On Fri, 7 Aug 2020 at 05:12, Malcolm Slaney <malcolm@xxxxxxxx> wrote:

Has there been anything formal published on this effect?
   https://www.iflscience.com/brain/what-the-hell-is-going-on-in-this-tiktok-audio-illusion

It sounds to me like a semantic version of the McGurk effect.

Nice demo.

- Malcolm


This message came from the mail archive
src/postings/2020/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University