
Re: [AUDITORY] Semantic McGurk Effect



Does anybody know the _first_ time such an effect was mentioned?  I would not be surprised if it were in the 19th century (e.g. Helmholtz) or earlier.  Possibly also in literary work or poetry.  The basic idea (perception as inference based on an internal model) dates back at least to Alhacen (Ibn al-Haytham), but I’m wondering more about the specific case of speech. 

Note that Michel Imbert very recently published a book on Ibn al-Haytham (http://www.vrin.fr/book.php?code=9782711629343) that I hope someone will translate into English.

Alain


> On 7 Aug 2020, at 10:38, Prof. Roger K. Moore <0000011559506d60-dmarc-request@xxxxxxxxxxxxxxx> wrote:
> 
> I must admit to being surprised by the surprise engendered by this video.  Anyone who was around during the early days of text-to-speech synthesis is very aware of the danger of presenting the text in advance of, or simultaneously with, the generated speech.  The intelligibility of the resulting synthesis could be zero without the 'prior' and 100% with the visual cue.
> 
> So, given that we know that perception involves the integration of top-down expectations with bottom-up evidence (going right back to Richard Warren's work on the 'phoneme restoration effect'), why is this TikTok demo surprising?  Or maybe I'm missing something?
> 
> Best wishes
> Roger
> 
> --------------------------------------------------------------------------------------------
> Prof ROGER K MOORE* BA(Hons) MSc PhD FIOA FISCA MIET
> 
> Chair of Spoken Language Processing
> Vocal Interactivity Lab (VILab), Sheffield Robotics
> Speech & Hearing Research Group (SPandH)
> Department of Computer Science, UNIVERSITY OF SHEFFIELD
> Regent Court, 211 Portobello, Sheffield, S1 4DP, UK
> 
> * Winner of the 2016 Antonio Zampolli Prize for "Outstanding Contributions 
> to the Advancement of Language Resources & Language Technology 
> Evaluation within Human Language Technologies"
> 
> e-mail:  r.k.moore@xxxxxxxxxxxxxxx
> web: http://staffwww.dcs.shef.ac.uk/people/R.K.Moore/
> twitter: @rogerkmoore
> Tel: +44 (0) 11422 21807
> Fax: +44 (0) 11422 21810
> Mob: +44 (0) 7910 073631
> 
> Editor-in-Chief: COMPUTER SPEECH AND LANGUAGE
> (http://www.journals.elsevier.com/computer-speech-and-language/)
> --------------------------------------------------------------------------------------------
> 
> 
> On Fri, 7 Aug 2020 at 05:12, Malcolm Slaney <malcolm@xxxxxxxx> wrote:
> Has there been anything formal published on this effect?
>    https://www.iflscience.com/brain/what-the-hell-is-going-on-in-this-tiktok-audio-illusion
> 
> It sounds to me like a semantic version of the McGurk effect.
> 
> Nice demo.
> 
> - Malcolm
>