Subject: Re: auditory scrambling
From: Valeriy Shafiro <Valeriy_Shafiro@xxxxxxxx>
Date: Wed, 19 Dec 2007 14:19:48 -0600
List-Archive: <http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Dear Pierre, Al and others,

I can't say that I specifically heard more consonants after speech segments from a single utterance were shuffled, although I wasn't listening for it specifically. I think my method of shuffling segments was too crude and without precise controls to detect changes in specific phonemes (whatever the utterance language). In fact, I think I added a small onset/offset ramp for each segment to avoid clicks and pops (which could potentially also have some phonetic effects).

However, based on some informal listening with several friends, I do recall that it was always possible to say that the shuffled utterance was speech if the segments were more than 50 ms long. With smaller sizes it wasn't always clear. I got mixed responses as to how many talkers people heard in a shuffled utterance, and I think there were some differences among utterances (e.g., length, sex). It was more like an impression of the number of talkers, rather than a clear-cut number. It was easy to separate male and female voices.

However, listeners weren't always sure if the language of the utterances was English or not. I included both Russian and English shuffled utterances. As would be expected, determining that the language in a shuffled utterance is not English was an easier task for Russian utterances than for English utterances.

Finally, the rate of speech in shuffled utterances was always faster than the original utterance, although it depended on segment duration. Generally, the smaller the segment duration the faster the rate.

Unfortunately, I haven't yet found the code I used, having done this about 10 years and about as many hard drive crashes ago, but it was straightforward and similar to what Ursula described.
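
For illustration only, the scrambling procedure discussed in this thread (cut the signal into segments, apply a short onset/offset ramp to each, rearrange in random order) might be sketched roughly as below. This is not the original Matlab or Java code, which was never posted. The function name scramble, its parameters, the linear ramp shape, and the 5 ms default ramp are all assumptions made for the sketch. Samples are taken to be a plain list of numbers at a known sample rate.

```python
import random


def scramble(samples, sample_rate, min_ms=100.0, max_ms=None, ramp_ms=5.0, seed=None):
    """Cut `samples` into segments, ramp each one, and return them shuffled.

    Hypothetical helper, not code from the thread. If max_ms is None, every
    segment lasts min_ms; otherwise each segment length is drawn uniformly
    from the range min_ms..max_ms.
    """
    rng = random.Random(seed)
    if max_ms is None:
        max_ms = min_ms
    lo = int(round(sample_rate * min_ms / 1000.0))
    hi = int(round(sample_rate * max_ms / 1000.0))
    if lo < 1 or hi < lo:
        raise ValueError("need 1 <= min segment length <= max segment length")
    ramp_len = int(round(sample_rate * ramp_ms / 1000.0))

    # Step 1: segmentation. The final segment may be shorter than requested.
    segments = []
    start = 0
    while start < len(samples):
        seg_len = rng.randint(lo, hi)
        segments.append(list(samples[start:start + seg_len]))
        start += seg_len

    # Linear onset/offset ramp on each segment to avoid clicks at the joins.
    # The ramp is capped at half the segment so the two ramps never overlap.
    for seg in segments:
        n = min(ramp_len, len(seg) // 2)
        for k in range(n):
            gain = k / n
            seg[k] *= gain
            seg[-1 - k] *= gain

    # Step 2: rearrangement in a random order.
    order = list(range(len(segments)))
    rng.shuffle(order)
    scrambled = [x for idx in order for x in segments[idx]]
    return scrambled, order
```

With a 4-second recording, scramble(samples, rate, min_ms=400) would yield ten 400 ms segments, and scramble(samples, rate, min_ms=50, max_ms=400) would draw each segment duration from the 50-400 ms range. The returned order list records which original segment ended up in each position, which may be useful for later analysis.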

Best,
Valeriy

-------------------------------------------------------------
Valeriy Shafiro
Communication Disorders and Sciences
Rush University Medical Center
Chicago, IL

office (312) 942 - 3298
lab (312) 942 - 3316
email: valeriy_shafiro@xxxxxxxx

-----AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxx> wrote: -----

To: AUDITORY@xxxxxxxx
From: Pierre Divenyi <pdivenyi@xxxxxxxx>
Sent by: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxx>
Date: 12/19/2007 10:01AM
Subject: Re: auditory scrambling

Dear Valeriy,

In light of Al's question and his reminding us of that interesting 30-years (yes, Chris!) old paper of Darwin and Bethell-Fox, I wonder if your scrambled speech effect would be even more powerful in your native Russian in which many consonants are palatalized and therefore contain more formant transitions than most (if not all) Indo-European and Ural-Altaic languages. Can a linguist colleague pitch in, please?

Pierre

At 04:24 PM 12/18/2007, Al Bregman wrote:
>Dear Valeriy and list:
>
>The perceptual effects of shuffled speech that you reported made me
>think of the fact that years ago, Chris Darwin and a colleague did an
>experiment in which they studied the effects of an instantaneous
>change in F0 from 101 Hz to 178 Hz in the middle of a synthesized
>syllable
>
>Darwin, C.J., & Bethell-Fox, C.E. (1977) Pitch continuity and speech
>source attribution. Journal of Experimental Psychology: Human
>Perception and Performance, 3, 665-672.
>
>The formant patterns changed smoothly between two vowels, and when the
>pitch was a monotone, the transitions were heard as semivowels and
>liquids. But when a discontinuity of pitch was introduced in the
>middle of the vowel transition, the listeners heard 2 separate speech
>sources, saying stop consonants. These consonants probably appeared
>because what came after the pitch change was completely dissociated
>from what came before.
>So it sounded like the sudden end of one
>vocalic sound and the sudden beginning of another. These offsets and
>onsets were heard as consonants.
>
>Valeriy, do you hear any spurious consonants when you listen to your
>rearranged segments? I should mention that your discontinuities are
>more severe than those of Darwin and Bethell-Fox, because you are
>breaking up formant transitions as well as F0 trajectories, while D &
>B kept the formant transitions intact.
>
>Best,
>
>Al
>
>-------------------------------------------------------------------
>Albert S. Bregman, Emeritus Professor
>Psychology Department, McGill University
>1205 Docteur Penfield Avenue
>Montreal, QC, Canada H3A 1B1.
> Tel: (514) 398-6103
> Fax: (514) 398-4896
>www.psych.mcgill.ca/labs/auditory/Home.html
>-------------------------------------------------------------------
>
>On Dec 18, 2007 1:13 PM, Valeriy Shafiro <Valeriy_Shafiro@xxxxxxxx> wrote:
> > Hi Mathias,
> >
> > I did this some years ago in Matlab and also in Java (with fewer controls).
> > I was trying to see how difficult putting speech, music or scenes together
> > becomes based on segment duration and number. But, unfortunately, it never
> > got to a formal experiment. One curious effect I remember is that when you
> > 'scramble' a speech utterance of a single talker into very short segments
> > (I want to say 50-100 ms long, but I don't remember precisely now), you
> > actually hear more than one talker, which is not all that surprising, I
> > thought, given the discontinuities introduced during scrambling. If you are
> > interested, I can look if I still have the code.
> >
> > Best,
> >
> > Valeriy
> >
> > -------------------------------------------------------------
> > Valeriy Shafiro
> > Communication Disorders and Sciences
> > Rush University Medical Center
> > Chicago, IL
> >
> > office (312) 942 - 3298
> > lab (312) 942 - 3316
> > email: valeriy_shafiro@xxxxxxxx
> >
> > -----AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxx> wrote: -----
> >
> > To: AUDITORY@xxxxxxxx
> > From: Mathias Oechslin <m.oechslin@xxxxxxxx>
> > Sent by: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxx>
> > Date: 12/18/2007 06:00AM
> > Subject: auditory scrambling
> >
> > Dear list,
> >
> > Does anyone have experience with an automatic approach to "scrambling"
> > acoustic stimuli?
> > That means, for example: first step, segmentation of a 4-second phrase into
> > 10 segments of 400 ms; second step, rearrangement in a random order.
> > A more advanced implementation would let the user define any possible
> > time range (e.g., 50-400 ms) for the segments that the script rearranges
> > randomly.
> >
> > Thanks for any ideas,
> > Mathias
> >
> > --
> >
> > **************************************************
> > Mathias Oechslin
> > Ph.D student
> > Department of Neuropsychology
> > Institute for Psychology
> > Binzmühlestrasse 14/25
> > University of Zürich
> > CH-8050 Zürich
> > Switzerland
> > http://www.psychologie.unizh.ch/neuropsy/
> >
> > m.oechslin@xxxxxxxx
> > phone: +41 44 635 74 07
> >
> > **************************************************
> >
>--