I must admit to being surprised by the surprise engendered by this video. Anyone who was around during the early days of text-to-speech synthesis is very aware of the danger of presenting the text in advance of, or simultaneously with, the generated speech. The intelligibility of the resulting synthesis could be zero without the 'prior' and 100% with the visual cue.
So, given that we know perception involves the integration of top-down expectations with bottom-up evidence (going right back to Richard Warren's work on the 'phoneme restoration effect'), why is this TikTok demo surprising? Or maybe I'm missing something?
Best wishes
Roger
--------------------------------------------------------------------------------------------
Prof ROGER K MOORE* BA(Hons) MSc PhD FIOA FISCA MIET
Chair of Spoken Language Processing
Vocal Interactivity Lab (VILab), Sheffield Robotics
Speech & Hearing Research Group (SPandH)
Department of Computer Science, UNIVERSITY OF SHEFFIELD
Regent Court, 211 Portobello, Sheffield, S1 4DP, UK
--------------------------------------------------------------------------------------------