Re: [AUDITORY] Semantic McGurk Effect

From: "Arthur, Claire" <claire.arthur@xxxxxxxxxx>

Date: Fri, 7 Aug 2020 19:14:49 +0000

Comments: To: "Prof. Roger K. Moore" <r.k.moore@xxxxxxxxxxxxxxx>

Sender: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx>

I believe that, because of the odd timbre and background noise, the text priming helps us choose what to parse as the "negative space". So, for example, you hear the "ee" from "needle" if you are trying to hear that word, whereas you simply ignore that "ee", since it falls between "brain" and "storm", if you are listening for "brainstorm". Fun! Thanks for sharing.

Claire

Claire Arthur
Assistant Professor, School of Music
College of Design
Georgia Institute of Technology
(404) 894-9110
claire.arthur@xxxxxxxxxx

From: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx> on behalf of Prof. Roger K. Moore <0000011559506d60-dmarc-request@xxxxxxxxxxxxxxx>
Sent: Friday, August 7, 2020 4:38 AM
To: AUDITORY@xxxxxxxxxxxxxxx <AUDITORY@xxxxxxxxxxxxxxx>
Subject: Re: [AUDITORY] Semantic McGurk Effect

I must admit to being surprised by the surprise engendered by this video. Anyone who was around during the early days of text-to-speech synthesis is very aware of the danger of presenting the text in advance of, or simultaneously with, the generated speech. The intelligibility of the resulting synthesis could be zero without the 'prior' and 100% with the visual cue.

So, given that we know that perception involves the integration of top-down expectations with bottom-up evidence (going right back to Richard Warren's work on the 'phoneme restoration effect'), why is this TikTok demo surprising? Or maybe I'm missing something?
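
As a rough illustration of that integration (a toy sketch, not a model taken from this thread or from the literature), one can treat the on-screen text as a prior over candidate words and the noisy audio as a likelihood, and combine the two with Bayes' rule. The two candidate words come from the demo discussed here; the function name `posterior` and every number below are made-up assumptions chosen only to show the shape of the argument.

```python
def posterior(prior, likelihood):
    """Combine a prior and a likelihood over the same candidate words (Bayes' rule)."""
    unnormalised = {word: prior[word] * likelihood[word] for word in prior}
    total = sum(unnormalised.values())
    return {word: value / total for word, value in unnormalised.items()}


# Assumption: the degraded audio is equally compatible with both words.
likelihood = {"brainstorm": 0.5, "needle": 0.5}

# No text shown: no top-down preference for either word.
flat_prior = {"brainstorm": 0.5, "needle": 0.5}

# Text "brainstorm" shown first: a strong top-down expectation (illustrative value).
primed_prior = {"brainstorm": 0.95, "needle": 0.05}

no_text = posterior(flat_prior, likelihood)
with_text = posterior(primed_prior, likelihood)

print("without text:", no_text)
print("with text:   ", with_text)
```

When the bottom-up evidence is ambiguous, the posterior simply follows the prior, which is one way to read the observation that the same audio can be heard as either word depending on the text.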

Best wishes

Roger

--------------------------------------------------------------------------------------------
Prof ROGER K MOORE* BA(Hons) MSc PhD FIOA FISCA MIET

Chair of Spoken Language Processing
Vocal Interactivity Lab (VILab), Sheffield Robotics
Speech & Hearing Research Group (SPandH)
Department of Computer Science, UNIVERSITY OF SHEFFIELD
Regent Court, 211 Portobello, Sheffield, S1 4DP, UK

* Winner of the 2016 Antonio Zampolli Prize for "Outstanding Contributions
to the Advancement of Language Resources & Language Technology
Evaluation within Human Language Technologies"

e-mail: r.k.moore@xxxxxxxxxxxxxxx
web: http://staffwww.dcs.shef.ac.uk/people/R.K.Moore/

twitter: @rogerkmoore
Tel: +44 (0) 11422 21807
Fax: +44 (0) 11422 21810
Mob: +44 (0) 7910 073631

Editor-in-Chief: COMPUTER SPEECH AND LANGUAGE
(http://www.journals.elsevier.com/computer-speech-and-language/)

--------------------------------------------------------------------------------------------

On Fri, 7 Aug 2020 at 05:12, Malcolm Slaney <malcolm@xxxxxxxx> wrote:

Has there been anything formal published on this effect?
https://www.iflscience.com/brain/what-the-hell-is-going-on-in-this-tiktok-audio-illusion

It sounds to me like a semantic version of the McGurk effect.

Nice demo.

- Malcolm