Re: [AUDITORY] Semantic McGurk Effect (Nathaniel Zuk)


Subject: Re: [AUDITORY] Semantic McGurk Effect
From:    Nathaniel Zuk  <ZUKN@xxxxxxxx>
Date:    Sat, 8 Aug 2020 09:57:14 +0000

(Today I learned) It's called a mondegreen.
https://www.newyorker.com/science/maria-konnikova/science-misheard-lyrics-mondegreens

Nate

________________________________
From: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxx> on behalf of Prof. Roger K. Moore <0000011559506d60-dmarc-request@xxxxxxxx>
Sent: Friday, August 7, 2020 3:40 PM
To: AUDITORY@xxxxxxxx <AUDITORY@xxxxxxxx>
Subject: Re: [AUDITORY] Semantic McGurk Effect

Great example Julia - and, of course, comedians such as Bec Hill (https://youtu.be/nEdR44Iftb4) and Peter Kay (https://youtu.be/7my5baoCVv8) trade on this effect - R

On Fri, 7 Aug 2020 at 14:42, Julia Strand <jstrand@xxxxxxxx> wrote:

I'm always delighted when auditory phenomena spark the public's interest!

I wouldn't call this a semantic McGurk, given that it doesn't have to be driven by simultaneous bottom-up input from two modalities. That is, even if nothing is written on the screen but you're just thinking "green needle" to yourself, that's what you're likely to hear (whereas thinking "ga" while hearing "ba" won't get you to "da" - you need the simultaneous input from face and voice). So I'd agree with Roger that it's more akin to the phoneme restoration effect or work like Cynthia Connine's "she ran hot water for the p/bath," showing how expectations influence interpretation of bottom-up input.

I think most of US wouldn't be surprised that the same stimulus can be perceived in different ways, but my impression is that the general public tends to believe "what you see is what you get" and underestimates the power of top-down influences. Same reason #TheDress was such a hit.

When I include this in my class on speech perception, I also include this video which shows Grover from Sesame Street (https://languagelog.ldc.upenn.edu/nll/?p=41249) saying EITHER "Yes, yes, that sounds like an excellent idea!" OR "Yes, yes, that's a f*%#g excellent idea!"

Like I'm always telling my students - Speech is hard! Context helps!

Best,
Julia

On Fri, Aug 7, 2020 at 4:28 AM Prof. Roger K. Moore <0000011559506d60-dmarc-request@xxxxxxxx> wrote:

I must admit to being surprised by the surprise engendered by this video. Anyone who was around during the early days of text-to-speech synthesis is very aware of the danger of presenting the text in advance of or simultaneous with the generated speech. The intelligibility of the resulting synthesis could be zero without the 'prior' and 100% with the visual cue.

So, given that we know that perception involves the integration of top-down expectations with bottom-up evidence (going right back to Richard Warren's work on the 'phoneme restoration effect'), why is this TikTok demo surprising? Or maybe I'm missing something?

Best wishes
Roger

--------------------------------------------------------------------------------
Prof ROGER K MOORE* BA(Hons) MSc PhD FIOA FISCA MIET

Chair of Spoken Language Processing
Vocal Interactivity Lab (VILab), Sheffield Robotics
Speech & Hearing Research Group (SPandH)
Department of Computer Science, UNIVERSITY OF SHEFFIELD
Regent Court, 211 Portobello, Sheffield, S1 4DP, UK

* Winner of the 2016 Antonio Zampolli Prize for "Outstanding Contributions
to the Advancement of Language Resources & Language Technology
Evaluation within Human Language Technologies"

e-mail: r.k.moore@xxxxxxxx
web: http://staffwww.dcs.shef.ac.uk/people/R.K.Moore/
twitter: @xxxxxxxx
Tel: +44 (0) 11422 21807
Fax: +44 (0) 11422 21810
Mob: +44 (0) 7910 073631

Editor-in-Chief: COMPUTER SPEECH AND LANGUAGE
(http://www.journals.elsevier.com/computer-speech-and-language/)
--------------------------------------------------------------------------------

On Fri, 7 Aug 2020 at 05:12, Malcolm Slaney <malcolm@xxxxxxxx> wrote:

Has there been anything formal published on this effect?
   https://www.iflscience.com/brain/what-the-hell-is-going-on-in-this-tiktok-audio-illusion

It sounds to me like a semantic version of the McGurk effect.

Nice demo.

- Malcolm

--
Julia Strand, PhD
Assistant Professor of Psychology
Carleton College
One North College Street
Northfield, MN 55057
507-222-5637
Website: https://apps.carleton.edu/curricular/psyc/jstrand/
Make an appointment: http://juliastrand.youcanbook.me


This message came from the mail archive
src/postings/2020/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University