Re: [AUDITORY] Semantic McGurk Effect (Prof. Roger K. Moore)


Subject: Re: [AUDITORY] Semantic McGurk Effect
From:    Prof. Roger K. Moore
Date:    Fri, 7 Aug 2020 09:38:29 +0100

I must admit to being surprised by the surprise engendered by this video. Anyone who was around during the early days of text-to-speech synthesis is very aware of the danger of presenting the text in advance of, or simultaneously with, the generated speech. The intelligibility of the resulting synthesis could be zero without the 'prior' and 100% with the visual cue.

So, given that we know that perception involves the integration of top-down expectations with bottom-up evidence (going right back to Richard Warren's work on the 'phoneme restoration effect'), why is this TikTok demo surprising? Or maybe I'm missing something?

Best wishes
Roger

--------------------------------------------------------------------------------------------
Prof ROGER K MOORE* BA(Hons) MSc PhD FIOA FISCA MIET

Chair of Spoken Language Processing
Vocal Interactivity Lab (VILab), Sheffield Robotics
Speech & Hearing Research Group (SPandH)
Department of Computer Science, UNIVERSITY OF SHEFFIELD
Regent Court, 211 Portobello, Sheffield, S1 4DP, UK

* Winner of the 2016 Antonio Zampolli Prize for "Outstanding Contributions
to the Advancement of Language Resources & Language Technology
Evaluation within Human Language Technologies"

e-mail: r.k.moore@xxxxxxxx
web: http://staffwww.dcs.shef.ac.uk/people/R.K.Moore/
twitter: @xxxxxxxx
Tel: +44 (0) 11422 21807
Fax: +44 (0) 11422 21810
Mob: +44 (0) 7910 073631

Editor-in-Chief: COMPUTER SPEECH AND LANGUAGE
(http://www.journals.elsevier.com/computer-speech-and-language/)
--------------------------------------------------------------------------------------------

On Fri, 7 Aug 2020 at 05:12, Malcolm Slaney <malcolm@xxxxxxxx> wrote:

> Has there been anything formal published on this effect?
>
> https://www.iflscience.com/brain/what-the-hell-is-going-on-in-this-tiktok-audio-illusion
>
> It sounds to me like a semantic version of the McGurk effect.
>
> Nice demo.
>
> - Malcolm


This message came from the mail archive
src/postings/2020/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University