Re: [AUDITORY] Semantic McGurk Effect (Prof. Roger K. Moore)


Subject: Re: [AUDITORY] Semantic McGurk Effect
From:    Prof. Roger K. Moore <"Prof. Roger K. Moore">
Date:    Fri, 7 Aug 2020 15:40:00 +0100

Great example Julia - and, of course, comedians such as Bec Hill
(https://youtu.be/nEdR44Iftb4) and Peter Kay (https://youtu.be/7my5baoCVv8)
trade on this effect - R

On Fri, 7 Aug 2020 at 14:42, Julia Strand <jstrand@xxxxxxxx> wrote:

> I'm always delighted when auditory phenomena spark the public's interest!
>
> I wouldn't call this a semantic McGurk, given that it doesn't have to be
> driven by simultaneous bottom-up input from two modalities. That is, even
> if nothing is written on the screen but you're just thinking "green needle"
> to yourself, that's what you're likely to hear (whereas thinking "ga" while
> hearing "ba" won't get you to "da" - you need the simultaneous input from
> face and voice). So I'd agree with Roger that it's more akin to the phoneme
> restoration effect or work like Cynthia Connine's "she ran hot water for
> the p/bath," showing how expectations influence interpretation of bottom-up
> input.
>
> I think most of US wouldn't be surprised that the same stimulus can be
> perceived in different ways, but my impression is that the general public
> tends to believe "what you see is what you get" and underestimates the
> power of top-down influences. Same reason #TheDress was such a hit.
>
> When I include this in my class on speech perception, I also include this
> video which shows Grover from Sesame street
> <https://languagelog.ldc.upenn.edu/nll/?p=41249> saying EITHER "Yes, yes,
> that sounds like an excellent idea!" OR "Yes, yes, that's a f*%#g excellent
> idea!"
>
> Like I'm always telling my students - Speech is hard! Context helps!
>
> Best,
> Julia
>
> On Fri, Aug 7, 2020 at 4:28 AM Prof. Roger K. Moore <0000011559506d60-dmarc-request@xxxxxxxx> wrote:
>
>> I must admit to being surprised by the surprise engendered by this
>> video. Anyone who was around during the early days of text-to-speech
>> synthesis is very aware of the danger of presenting the text in advance of
>> or simultaneous with the generated speech. The intelligibility of the
>> resulting synthesis could be zero without the 'prior' and 100% with the
>> visual cue.
>>
>> So, given that we know that perception involves the integration of
>> top-down expectations with bottom-up evidence (going right back to Richard
>> Warren's work on the 'phoneme restoration effect'), why is this TikTok demo
>> surprising? Or maybe I'm missing something?
>>
>> Best wishes
>> Roger
>>
>> --------------------------------------------------------------------------------------------
>> Prof ROGER K MOORE* BA(Hons) MSc PhD FIOA FISCA MIET
>>
>> Chair of Spoken Language Processing
>> Vocal Interactivity Lab (VILab), Sheffield Robotics
>> Speech & Hearing Research Group (SPandH)
>> Department of Computer Science, UNIVERSITY OF SHEFFIELD
>> Regent Court, 211 Portobello, Sheffield, S1 4DP, UK
>>
>> * Winner of the 2016 Antonio Zampolli Prize for "Outstanding Contributions
>> to the Advancement of Language Resources & Language Technology
>> Evaluation within Human Language Technologies"
>>
>> e-mail: r.k.moore@xxxxxxxx
>> web: http://staffwww.dcs.shef.ac.uk/people/R.K.Moore/
>> twitter: @xxxxxxxx
>> Tel: +44 (0) 11422 21807
>> Fax: +44 (0) 11422 21810
>> Mob: +44 (0) 7910 073631
>>
>> Editor-in-Chief: COMPUTER SPEECH AND LANGUAGE
>> (http://www.journals.elsevier.com/computer-speech-and-language/)
>> --------------------------------------------------------------------------------------------
>>
>> On Fri, 7 Aug 2020 at 05:12, Malcolm Slaney <malcolm@xxxxxxxx> wrote:
>>
>>> Has there been anything formal published on this effect?
>>>
>>> https://www.iflscience.com/brain/what-the-hell-is-going-on-in-this-tiktok-audio-illusion
>>>
>>> It sounds to me like a semantic version of the McGurk effect.
>>>
>>> Nice demo.
>>>
>>> - Malcolm
>
> --
> Julia Strand, PhD
> Assistant Professor of Psychology
> Carleton College
> One North College Street
> Northfield, MN 55057
> 507-222-5637
> Website <https://apps.carleton.edu/curricular/psyc/jstrand/>
> Make an appointment <http://juliastrand.youcanbook.me>


This message came from the mail archive
src/postings/2020/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University