Re: Al's Experiment ("Robert E. Remez")


Subject: Re:      Al's Experiment
From:    "Robert E. Remez"  <REMEZ(at)PARADISE.BARNARD.COLUMBIA.EDU>
Date:    Tue, 22 Sep 1992 21:03:44 -0400

I am replying to this query:

>>Are the three tones, at the formants, fused...?
>>Jont Allen

When the time-varying frequencies of the formant centers are replicated by sinusoids, a variety of perceptual effects are noted:

o  listeners identify the resulting tone-complexes as several simultaneously varying tones, as radio interference, as bad electronic music (as opposed to "good" electronic music, one must suppose), as equipment failure, as experimenter error, etc. Impressions reported by subjects seem to describe the auditory forms as such, or offer hypothetical mechanical events that might have caused such sounds;

o  listeners do recognize the linguistic properties of sinewave replicas once asked to attend to them as "synthetic speech;"

o  the phonetic effects of the tonal analogs are not available from tones presented as singletons; the first and second formant analogs must be presented as an ensemble for listeners to obtain any phonetic effects;

o  slight departures from natural time-variation in sinewave replicas destroy the phonetic coherence;

o  the quality of the sinewave voice is reported to be unnatural, far more unnatural than signals produced by conventional speech synthesis;

o  listeners show evidence of scale normalization appropriate to vocal tract size when they perceive sinewave vowels;

o  the intonation of sinewave sentences (lacking comodulated formants, the natural source of intonation--the fundamental frequency of phonation--is simply absent from sinewave replicas of speech) is attributable to the multiple use of the analog of the first formant; apparently, it is responsible for phonetic information approximate to the lowest oral resonance of natural speech, and it is responsible for the pitch contour of the sentence; whether it is also heard as an auditory form without phonetic attributes is yet to be determined;

o  listeners can simultaneously resolve to the auditory form of the tone analog of the second formant in a sinewave word as they resolve its phonetic effects; this is a kind of duplex perception at the heart of the recent to and fro inspired by Al Bregman this week;

o  unlike the multistable percepts in the visual system, which alternate--recall the reversals of the Rubin vase, Schroeder staircase, Necker cube--the multistable perception of a sinewave word is simultaneous, not successive: one is phonetic (the word), the other an impression of the auditory forms (several tones changing in pitch and loudness).

(Confidential to Pierre Divenyi: This mini-tutorial on the perception of sinewave replicas should REALLY clear out the uncertain subscribers...)

Remez
-------
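
For readers who want to hear such a signal, the construction described at the top of this message (one sinusoid per formant center, following its time-varying frequency) can be sketched as below. This is only a minimal illustration in Python, not the procedure used in the experiments: the function name sinewave_replica, the frame and sample rates, and the toy formant and amplitude values are all assumptions made up for the example, and real formant tracks would come from an analysis of a natural utterance.

```python
import math


def sinewave_replica(formant_tracks, amplitude_tracks, frame_rate, sample_rate):
    """Sum one time-varying sinusoid per formant track.

    formant_tracks   -- list of tracks; each track lists one formant's
                        center frequency in Hz, one value per frame
    amplitude_tracks -- matching list of linear amplitudes per frame
    frame_rate       -- frames per second (hypothetical analysis rate)
    sample_rate      -- output samples per second
    """
    n_frames = len(formant_tracks[0])
    samples_per_frame = sample_rate // frame_rate
    n_samples = (n_frames - 1) * samples_per_frame
    out = [0.0] * n_samples

    for freqs, amps in zip(formant_tracks, amplitude_tracks):
        phase = 0.0
        for n in range(n_samples):
            # Linearly interpolate frequency and amplitude between frames.
            pos = n / samples_per_frame
            i = int(pos)
            frac = pos - i
            f = freqs[i] + frac * (freqs[i + 1] - freqs[i])
            a = amps[i] + frac * (amps[i + 1] - amps[i])
            out[n] += a * math.sin(phase)
            # Accumulate phase so a changing frequency yields a smooth glide.
            phase += 2.0 * math.pi * f / sample_rate
    return out


# Toy three-tone example; the numbers are invented, not measured formants.
f1 = [300.0, 500.0, 700.0]
f2 = [2200.0, 1800.0, 1200.0]
f3 = [2900.0, 2600.0, 2500.0]
amps = [[0.5] * 3, [0.3] * 3, [0.2] * 3]
signal = sinewave_replica([f1, f2, f3], amps, frame_rate=100, sample_rate=8000)
```

Passing a single track (for example, only f2) gives the kind of singleton tone that, per the list above, carries no phonetic effect on its own.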


This message came from the mail archive
http://www.auditory.org/postings/1992/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University