Re: auditory scrambling
Dear Pierre, Al and others,
I can't say that I heard more consonants after the speech segments of a
single utterance were shuffled, although I wasn't listening for them
specifically. I think my method of shuffling segments was too crude, and
lacked the precise controls needed, to detect changes in specific phonemes
(whatever the language of the utterance). In fact, I think I added a small onset/offset
ramp for each segment to avoid clicks and pops (which could potentially
also have some phonetic effects). However, based on some informal
listening with several friends, I do recall that it was always possible to
say that the shuffled utterance was speech if the segments were more than
50 ms long. With shorter segments it wasn't always clear. I got mixed
responses as to how many talkers people heard in a shuffled utterance, and
I think there were some differences among utterances (e.g., in length or
talker sex). It was more an impression of the number of talkers than a
clear-cut number. It was easy to separate male and female voices.
However, listeners weren't always sure if the language of the utterances
was English or not. I included both Russian and English shuffled
utterances. As would be expected, determining that the language in a
shuffled utterance is not English was an easier task for Russian utterances
than for English utterances. Finally, the rate of speech in shuffled
utterances was always faster than in the original utterance, although it
depended on segment duration. Generally, the shorter the segment duration,
the faster the rate.
Unfortunately, I haven't yet found the code I used, having done this about
10 years and about as many hard drive crashes ago, but it was
straightforward and similar to what Ursula described.
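
A minimal sketch of the kind of segment shuffling discussed in this thread
follows. It is not the original Matlab or Java code, which has not been
recovered. The function name scramble, its parameters, the linear ramp
shape, and the default values (50 ms segments, 2 ms ramps) are all
assumptions made for illustration; only the general steps (cut into
segments, apply a small onset/offset ramp, rearrange randomly) come from
the messages here.

```python
import math
import random


def scramble(samples, sample_rate, segment_ms=50.0, ramp_ms=2.0, seed=None):
    """Cut a signal into fixed-length segments, ramp each one, and shuffle.

    samples     -- sequence of numeric sample values (mono)
    sample_rate -- samples per second
    segment_ms  -- segment duration in milliseconds (assumed default)
    ramp_ms     -- linear onset/offset ramp duration in milliseconds
                   (assumed default; 0 disables the ramp)
    seed        -- optional seed so a given shuffle can be reproduced
    """
    seg_len = max(1, int(round(sample_rate * segment_ms / 1000.0)))
    ramp_len = min(int(round(sample_rate * ramp_ms / 1000.0)), seg_len // 2)

    # Step 1: segmentation. The last segment may be shorter than the rest.
    segments = [list(samples[i:i + seg_len])
                for i in range(0, len(samples), seg_len)]

    # Step 2: linear onset/offset ramp on every segment to avoid clicks.
    for seg in segments:
        n = min(ramp_len, len(seg) // 2)
        for k in range(n):
            gain = k / n  # 0 at the segment edge, rising toward 1
            seg[k] *= gain
            seg[-1 - k] *= gain

    # Step 3: rearrangement in a random order.
    rng = random.Random(seed)
    rng.shuffle(segments)

    # Concatenate the shuffled segments back into one signal.
    return [x for seg in segments for x in seg]


if __name__ == "__main__":
    # Hypothetical input: a 4-second, 220 Hz tone at 8 kHz, cut into
    # ten 400 ms segments as in the example that opened the thread.
    rate = 8000
    tone = [math.sin(2 * math.pi * 220 * t / rate) for t in range(4 * rate)]
    shuffled = scramble(tone, rate, segment_ms=400.0, seed=1)
```

Fixed-length segments keep the sketch short. Drawing each segment length
at random from a range (for example 50-400 ms) would be a small change to
step 1. Note that the ramp itself alters the signal at every segment
boundary, so it may have phonetic effects of its own.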
Best,
Valeriy
-------------------------------------------------------------
Valeriy Shafiro
Communication Disorders and Sciences
Rush University Medical Center
Chicago, IL
office (312) 942 - 3298
lab (312) 942 - 3316
email: valeriy_shafiro@xxxxxxxx
-----AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx>
wrote: -----
To: AUDITORY@xxxxxxxxxxxxxxx
From: Pierre Divenyi <pdivenyi@xxxxxxxxx>
Sent by: AUDITORY - Research in Auditory Perception
<AUDITORY@xxxxxxxxxxxxxxx>
Date: 12/19/2007 10:01AM
Subject: Re: auditory scrambling
Dear Valeriy,
In light of Al's question and his reminding us of
that interesting 30-year-old (yes, Chris!) paper
of Darwin and Bethell-Fox, I wonder if your
scrambled speech effect would be even more
powerful in your native Russian, in which many
consonants are palatalized and therefore contain
more formant transitions than those of most (if not
all) other Indo-European and Ural-Altaic languages.
Can a linguist colleague pitch in, please?
Pierre
At 04:24 PM 12/18/2007, Al Bregman wrote:
>Dear Valeriy and list:
>
>The perceptual effects of shuffled speech that you reported made me
>think of the fact that years ago, Chris Darwin and a colleague did an
>experiment in which they studied the effects of an instantaneous
>change in F0 from 101 Hz to 178 Hz in the middle of a synthesized
>syllable:
>
>Darwin, C.J., & Bethell-Fox, C.E. (1977) Pitch continuity and speech
>source attribution. Journal of Experimental Psychology: Human
>Perception and Performance, 3, 665-672.
>
>The formant patterns changed smoothly between two vowels, and when the
>pitch was a monotone, the transitions were heard as semivowels and
>liquids. But when a discontinuity of pitch was introduced in the
>middle of the vowel transition, the listeners heard 2 separate speech
>sources, saying stop consonants. These consonants probably appeared
>because what came after the pitch change was completely dissociated
>from what came before. So it sounded like the sudden end of one
>vocalic sound and the sudden beginning of another. These offsets and
>onsets were heard as consonants.
>
>Valeriy, do you hear any spurious consonants when you listen to your
>rearranged segments? I should mention that your discontinuities are
>more severe than those of Darwin and Bethell-Fox, because you are
>breaking up formant transitions as well as F0 trajectories, while D &
>B kept the formant transitions intact.
>
>Best,
>
>Al
>
>-------------------------------------------------------------------
>Albert S. Bregman, Emeritus Professor
>Psychology Department, McGill University
>1205 Docteur Penfield Avenue
>Montreal, QC, Canada H3A 1B1.
> Tel: (514) 398-6103
> Fax: (514) 398-4896
>www.psych.mcgill.ca/labs/auditory/Home.html
>-------------------------------------------------------------------
>
>
>
>
>On Dec 18, 2007 1:13 PM, Valeriy Shafiro <Valeriy_Shafiro@xxxxxxxx> wrote:
> > Hi Mathias,
> >
> > I did this some years ago in Matlab and also in Java (with fewer
> > controls). I was trying to see how difficult putting speech, music or
> > scenes together becomes based on segment duration and number. But,
> > unfortunately, it never got to a formal experiment. One curious effect
> > I remember is that when you 'scramble' a speech utterance of a single
> > talker into very short segments (I want to say 50-100 ms long, but I
> > don't remember precisely now), you actually hear more than one talker,
> > which is not all that surprising, I thought, given the discontinuities
> > introduced during scrambling. If you are interested, I can look if I
> > still have the code.
> >
> > Best,
> >
> > Valeriy
> >
> >
> >
> >
> >
> >
> > -----AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx>
> > wrote: -----
> >
> >
> > To: AUDITORY@xxxxxxxxxxxxxxx
> > From: Mathias Oechslin <m.oechslin@xxxxxxxxxxxxxxxxxxxx>
> > Sent by: AUDITORY - Research in Auditory Perception
> > <AUDITORY@xxxxxxxxxxxxxxx>
> > Date: 12/18/2007 06:00AM
> > Subject: auditory scrambling
> >
> > Dear list,
> >
> > Has anyone any experience with an automatic approach to "scramble"
> > acoustic stimuli?
> > That means, for example: first step, segmentation of a 4-second phrase
> > into 10 segments of 400 ms; second step, rearrangement in a random order.
> > An advanced implementation would offer the option of defining any
> > possible time range (e.g., 50-400 ms) at which the script rearranges the
> > file randomly.
> >
> > Thanks for any ideas,
> > Mathias
> >
> >
> > --
> >
> >
> >
> >
> > **************************************************
> > Mathias Oechslin
> > Ph.D student
> > Department of Neuropsychology
> > Institute for Psychology
> > Binzmühlestrasse 14/25
> > University of Zürich
> > CH-8050 Zürich
> > Switzerland
> > http://www.psychologie.unizh.ch/neuropsy/
> >
> > m.oechslin@xxxxxxxxxxxxxxxxxxxx
> > phone: +41 44 635 74 07
> >
> >
> > **************************************************
>
>
>
>--