Owen’s point reminds me of a discussion on this list, maybe 20 years or so ago. Someone remarked that when on a plane, listening to the engines warm up, they could sometimes hear tunes in that sound, and could to some extent choose which tunes to hear. It’s another example of the general phenomenon that one can get percepts that plausibly map onto more than one stimulus, and that the brain uses top-down information to carve out one route in that mapping.

I also like Julie’s point: the demographic overlap between readers of early text-to-speech articles and users of TikTok may not be as great as many of us would wish :)

Bob

PS Remember planes?

From: Owen Brimijoin <owen.brimijoin@xxxxxx>
It definitely works with non-words. If you play continuous noise with randomized spectra to people and ask them to press a button whenever they hear a particular vowel sound in there, they quite happily do so. Even though you never intentionally embed any vowel sound in the noise, when you average the spectrograms of the signal just before each of the hundreds of button presses, hey presto: you see the spectrum of the vowel they were listening for.
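In code, the analysis amounts to averaging the spectral frames that immediately precede the presses. A minimal Python sketch, assuming you have kept the frame-by-frame noise spectra and the press times (the function name and the 200-ms window are illustrative assumptions, not taken from any particular study):

    import numpy as np

    def pre_press_spectrum(frame_spectra, frame_times, press_times, window=0.2):
        # frame_spectra: (n_frames, n_bins) spectra of the noise, frame by frame
        # frame_times:   (n_frames,) time stamp of each frame, in seconds
        # press_times:   times of the listener's button presses, in seconds
        chunks = []
        for t in press_times:
            mask = (frame_times >= t - window) & (frame_times < t)
            if mask.any():
                chunks.append(frame_spectra[mask].mean(axis=0))
        # Averaged over hundreds of presses, press-triggering spectral structure
        # survives while the rest of the random noise averages out.
        return np.mean(chunks, axis=0)

A common refinement is to subtract the grand mean spectrum of all frames, so that only the press-related structure remains.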
It’s a little spooky. And very cool.

Link to shameless self-promotion:

- Owen

From: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx>
on behalf of Bob Carlyon <Bob.Carlyon@xxxxxxxxxxxxxxxxx>

Hi Malcolm,

Nice video. Kind of you to shave your legs for our benefit. I think this is an example of the general finding that prior information affects the perception of degraded speech, which has been extensively investigated with vocoded speech. When vocoded with only a few channels, speech can sound unintelligible, but it sounds clear and obvious when preceded by either a written or a spoken (clear speech) version of the original. It has been found that this kind of clear-then-distorted exposure speeds up learning, so that it becomes easier to recognise new sentences vocoded in the same way:

Davis MH, Johnsrude IS, Hervais-Adelman A, Taylor K, McGettigan C (2005) Lexical information drives perceptual learning of distorted speech: Evidence from the comprehension of noise-vocoded sentences. Journal of Experimental Psychology-General 134:222-241.
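In case it is useful, noise vocoding itself is easy to sketch. A minimal Python illustration follows; the function name, channel count, band edges, and filter order are my own illustrative choices, not the parameters of the study above:

    import numpy as np
    from scipy.signal import butter, sosfiltfilt, hilbert

    def noise_vocode(x, fs, n_channels=4, f_lo=100.0, f_hi=8000.0):
        # Split the signal into log-spaced bands, take each band's amplitude
        # envelope, and use it to modulate noise filtered into the same band.
        edges = np.logspace(np.log10(f_lo), np.log10(f_hi), n_channels + 1)
        noise = np.random.randn(len(x))
        out = np.zeros(len(x))
        for lo, hi in zip(edges[:-1], edges[1:]):
            sos = butter(4, [lo, hi], btype='bandpass', fs=fs, output='sos')
            band = sosfiltfilt(sos, x)
            env = np.abs(hilbert(band))          # amplitude envelope of the band
            out += env * sosfiltfilt(sos, noise)  # envelope-modulated band noise
        return out / (np.abs(out).max() + 1e-12)  # normalise to avoid clipping

With only a few channels, the output of something like this is typically hard to understand on first hearing, but pops out once you have read or heard the clear version.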
The difference between the video and most published studies is that here the distorted speech could plausibly ‘map onto’ one of two sentences. My colleague Matt Davis, in a public science lecture, had the audience play “vocoder bingo”: Matt played several vocoded words, the audience were all given cards with some words written on them, and they had to tick off each word when they heard it. Much excitement ensued (I’m not sure what the prize was), but the catch was that none of the words were actually presented; they were just sufficiently similar/ambiguous to be convincing when paired with the written text. Matt tells me that they have published an imaging study looking at brain responses to ambiguous vocoded words when cued to hear them one way or another (e.g. ‘pit’ or ‘kitsch’). This may be the closest published work to what you ask for:

Blank H, Spangenberg M, Davis MH (2018) Neural prediction errors distinguish perception and misperception of speech. Journal of Neuroscience 38(27):6076-6089. https://www.jneurosci.org/content/38/27/6076

Finally, I suspect that this is not a semantic effect, as I expect it would work with non-words.

All the best,
Bob

From: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx>
On Behalf Of Malcolm Slaney

Has there been anything formal published on this effect? It sounds to me like a semantic version of the McGurk effect. Nice demo.

- Malcolm