Re: [AUDITORY] Semantic McGurk Effect



I believe that, because of the odd timbre and background noise, the text priming helps us choose what to parse as the "negative space". For example, you hear the "ee" from "needle" if you are trying to hear that word, whereas you simply ignore that "ee", since it falls between "brain" and "storm", if you are listening for "brainstorm". Fun! Thanks for sharing.

Claire




Claire Arthur
Assistant Professor, School of Music
College of Design
Georgia Institute of Technology
(404) 894-9110
claire.arthur@xxxxxxxxxx



From: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx> on behalf of Prof. Roger K. Moore <0000011559506d60-dmarc-request@xxxxxxxxxxxxxxx>
Sent: Friday, August 7, 2020 4:38 AM
To: AUDITORY@xxxxxxxxxxxxxxx <AUDITORY@xxxxxxxxxxxxxxx>
Subject: Re: [AUDITORY] Semantic McGurk Effect
 
I must admit to being surprised by the surprise engendered by this video. Anyone who was around during the early days of text-to-speech synthesis is very aware of the danger of presenting the text in advance of, or simultaneously with, the generated speech. The intelligibility of the resulting synthesis could be zero without the 'prior' and 100% with the visual cue.

So, given that we know that perception involves the integration of top-down expectations with bottom-up evidence (going right back to Richard Warren's work on the 'phonemic restoration effect'), why is this TikTok demo surprising? Or maybe I'm missing something?
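
One way to picture the integration of top-down expectation and bottom-up evidence described above is a toy Bayesian calculation. This is only an illustrative sketch, not a model anyone in this thread has proposed: the function name `posterior` and every number below are assumptions chosen for the example. The audio is treated as equally compatible with both words, and the on-screen text is treated as shifting the prior.

```python
def posterior(likelihood, prior):
    """Combine bottom-up evidence (likelihood) with top-down
    expectation (prior) using Bayes' rule, then normalise."""
    unnormalised = {word: likelihood[word] * prior[word] for word in likelihood}
    total = sum(unnormalised.values())
    return {word: p / total for word, p in unnormalised.items()}

# Hypothetical values: the degraded audio supports both words equally.
likelihood = {"brainstorm": 0.5, "needle": 0.5}

# Without any text, assume the listener has no preference.
flat_prior = {"brainstorm": 0.5, "needle": 0.5}

# Reading "needle" on screen is assumed to shift the expectation strongly.
primed_prior = {"brainstorm": 0.05, "needle": 0.95}

no_text = posterior(likelihood, flat_prior)
with_text = posterior(likelihood, primed_prior)
```

With ambiguous evidence the posterior simply follows the prior, so under these assumed numbers the same audio goes from a coin flip to an almost certain "needle" once the text is shown.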

Best wishes
Roger

--------------------------------------------------------------------------------------------
Prof ROGER K MOORE* BA(Hons) MSc PhD FIOA FISCA MIET

Chair of Spoken Language Processing
Vocal Interactivity Lab (VILab), Sheffield Robotics
Speech & Hearing Research Group (SPandH)
Department of Computer Science, UNIVERSITY OF SHEFFIELD
Regent Court, 211 Portobello, Sheffield, S1 4DP, UK

* Winner of the 2016 Antonio Zampolli Prize for "Outstanding Contributions 
to the Advancement of Language Resources & Language Technology 
Evaluation within Human Language Technologies"

e-mail:  r.k.moore@xxxxxxxxxxxxxxx
web: http://staffwww.dcs.shef.ac.uk/people/R.K.Moore/
twitter: @rogerkmoore
Tel: +44 (0) 11422 21807
Fax: +44 (0) 11422 21810
Mob: +44 (0) 7910 073631

Editor-in-Chief: COMPUTER SPEECH AND LANGUAGE
(http://www.journals.elsevier.com/computer-speech-and-language/)
--------------------------------------------------------------------------------------------


On Fri, 7 Aug 2020 at 05:12, Malcolm Slaney <malcolm@xxxxxxxx> wrote:
Has there been anything formal published on this effect?

It sounds to me like a semantic version of the McGurk effect.

Nice demo.

- Malcolm