Re: [AUDITORY] Semantic McGurk Effect

To: AUDITORY@xxxxxxxxxxxxxxx
Subject: Re: [AUDITORY] Semantic McGurk Effect
From: Alain de Cheveigne <alain.de.cheveigne@xxxxxx>
Date: Fri, 7 Aug 2020 11:41:30 +0200
Comments: To: "Prof. Roger K. Moore" <r.k.moore@xxxxxxxxxxxxxxx>
Reply-to: Alain de Cheveigne <alain.de.cheveigne@xxxxxx>
Sender: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx>

Does anybody know the _first_ time such an effect was mentioned?  I would not be surprised if it were in the 19th century (e.g. Helmholtz) or earlier.  Possibly also in literary work or poetry.  The basic idea (perception as inference based on an internal model) dates back at least to Alhacen (Ibn al-Haytham), but I’m wondering more about the specific case of speech. 

Note that Michel Imbert very recently published a book on Ibn al-Haytham (http://www.vrin.fr/book.php?code=9782711629343) that I hope someone will translate into English.

Alain


> On 7 Aug 2020, at 10:38, Prof. Roger K. Moore <0000011559506d60-dmarc-request@xxxxxxxxxxxxxxx> wrote:
> 
> I must admit to being surprised by the surprise engendered by this video.  Anyone who was around during the early days of text-to-speech synthesis is very aware of the danger of presenting the text in advance of, or simultaneously with, the generated speech.  The intelligibility of the resulting synthesis could be zero without the 'prior' and 100% with the visual cue.
> 
> So, given that we know that perception involves the integration of top-down expectations with bottom-up evidence (going right back to Richard Warren's work on the 'phoneme restoration effect'), why is this TikTok demo surprising?  Or maybe I'm missing something?
> 
> Best wishes
> Roger
> 
> 
> On Fri, 7 Aug 2020 at 05:12, Malcolm Slaney <malcolm@xxxxxxxx> wrote:
> Has there been anything formal published on this effect?
>    https://www.iflscience.com/brain/what-the-hell-is-going-on-in-this-tiktok-audio-illusion
> 
> It sounds to me like a semantic version of the McGurk effect.
> 
> Nice demo.
> 
> - Malcolm
>
