PHASES/PITCH/BRIGHTNESS discussion
Dear List,
My queries about the influence of phase on pitch and timbre, and about the
"brightness vs. pitch" problem, generated many responses and discussions
that were very interesting and informative.
I could surely not have gathered all this information in any other way -
thanks to the list members.
I received messages asking me to collect and post all the discussion
material. I did that partly once, but the discussion was not yet finished.
So, I am posting it all now.
Alexander Galembo
===================
From: Alex Galembo
Galembo@pavlov.psyc.queensu.ca
Hi,
I want to know about published and reported demonstrations of the timbral
importance of the phase spectrum of a multicomponent tone.
I know the main works, but there are not many, and I am afraid of missing
anything important on the topic.
I will appreciate any information.
Thank you,
Alexander Galembo
===================
From: Alex Galembo
Dear Listers,
It is a classical result that the pitch of a periodic complex tone is
independent of the phases of its harmonics.
I would appreciate being informed of any publications casting doubt
on this phase independence (if they exist).
Thank you,
Alex Galembo
===================
From: Peter Cariani
Hi Alex,
Several years ago I went through the literature on phase effects in
conjunction with our work on population-interspike interval representations
of the pitches of complex tones.
Cariani, Peter A., and Bertrand Delgutte. 1996. Neural correlates of the
pitch of complex tones. I. Pitch and pitch salience. II. Pitch shift, pitch
ambiguity, phase-invariance, pitch circularity, and the dominance region
for pitch. J. Neurophysiology 76 (3): 1698-1734. (2 papers)
What I concluded from my readings was that:
0. Phase structure is much more important for nonstationary sounds (in
which a particular phase structure is not repeated at some fixed recurrence
time) than for stationary ones (where a particular phase structure is
repeated at some fixed recurrence time, 1/F0).
For nonstationary sounds, phase structure is very important for
timbre (as Roy Patterson has demonstrated).
1. For stationary sounds, phase does not seem to affect the pitch of sounds
with lower frequency harmonics (say below 1-2 kHz).
For stationary sounds, phase also does not seem to affect
the timbre of sounds with lower frequency harmonics.
E.g. I think it's v. hard to alter either the pitch
or timbre of vowels by altering the phase spectrum.
However, phase spectrum can affect the salience (strength)
of the pitch that is heard. (A waveform with a higher peak
factor probably generates more F0-related intervals in
high-CF regions).
2. Phase has limited effects for higher frequency harmonics. Only
special phase manipulations alter the pitch of such complexes,
and when they do, they result in octave shifts (up). There seems to
be no way that one can get arbitrary pitch shifts from phase
manipulations (someone correct me if I'm wrong).
In terms of interspike interval models, the intervals produced by
higher frequency harmonics are related mainly to the envelopes
of the cochlear filtered stimulus waveform. Phase alterations
that give rise to the octave jumps do so by halving envelope periods,
thereby producing intervals at 2*F0 (or potentially, n*F0).
One could think of the Flanagan-Gutman alternating polarity click
trains and the Pierce tone pip experiments in these terms. For high
frequency components, these phase manipulations produce envelopes
with large modulations at multiples of F0, and the intervals
produced follow these envelopes. In our study of pitch in the
auditory nerve (above), we observed that if you consider only
fibers with CF's above 2 kHz (as would be the ANF subpopulation mainly
excited by a high-pass filtered alternating click train,
where these effects are most pronounced), the most frequent
interspike interval corresponds to the click rate (here 2*F0)
rather than the true fundamental (F0). This corresponds with what
is heard.
However, if one takes the entire ANF population (all CF's), the
predominant interval is always at 1/F0, which is not what is heard
at low click rates (one hears a pitch at the click rate, an octave
above F0). My thinking on this is that intra-channel
interspike intervals may not be the whole story; that for such
stimuli (esp. under high-pass filtering) strong interchannel
interval patterns and synchronies are set up, and these might also
play a factor in the central interval analysis.
3. Despite the largely phase-invariant nature of our perception of
stationary sounds, this doesn't mean that phase isn't important.
If one takes a segment of noise 5 msec long and repeats
it many times, one will hear a pitch at 200 Hz. If you scramble
the phase spectrum of the noise segment in each period, you will
no longer hear the repetition pitch. (One can do a similar
periodicity-detection experiment with random click trains with
recurrence times of seconds.)
I therefore think that phase coherence is important even
for those aspects of auditory
perception that appear to be largely insensitive to which
particular phase configuration is chosen.
According to an all-order interval-based theory, one needs
constant phase relations spanning at least 2 periods to
preferentially create intervals related to the
repetition period.
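A minimal Python sketch of the repeated-noise demonstration above (assuming
numpy and scipy; the 5-msec segment gives the 200-Hz case described, and the
other parameters are illustrative):

import numpy as np
from scipy.io import wavfile

fs = 16000                            # sample rate (Hz), illustrative
seg = np.random.randn(fs // 200)      # one frozen 5-msec noise segment

# Repeating the identical segment many times -> repetition pitch at 200 Hz.
frozen = np.tile(seg, 400)

# Same magnitude spectrum in every period, but phases scrambled per period
# -> the repetition pitch disappears.
mags = np.abs(np.fft.rfft(seg))
scrambled = np.concatenate(
    [np.fft.irfft(mags * np.exp(2j * np.pi * np.random.rand(mags.size)),
                  n=seg.size) for _ in range(400)])

for name, x in (("frozen.wav", frozen), ("scrambled.wav", scrambled)):
    wavfile.write(name, fs, (0.3 * x / np.abs(x).max()).astype(np.float32))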
There is even a more general way of thinking about
detection of periodicity that involves the fusing together of
phase-relations that are constant into auditory objects, and
separating those relations that continually change. If we
think of 2 diff. vowels with diff. F0's added together, the
composite waveform contains 2 sets of internally-invariant
phase relations (two periods of each vowel's waveform)
plus the changing phase relations between the
two vowel periods (pitch period asynchronies). If one had a
means of detecting invariant phase structure, then one could
separate these two auditory objects. I think Roy Patterson's strobed
auditory image model moves in this direction, as do the kinds of
recurrent timing models I am working on.
Because of phase-locking of auditory nerve fibers, the timings of
individual spike discharges provide a representation of the
running stimulus phase spectrum. Interspike interval distributions
are then one way of neurally representing recurrent phase relations.
The formation of interval patterns depends crucially upon
phase structure, but once intervals are formed,
then the resulting representations are phase-independent.
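A minimal Python sketch of this last point (assuming numpy; the waveform
autocorrelation is stood in here for an all-order interval distribution, an
analogy rather than a neural model): the autocorrelation of a stationary
harmonic complex peaks at 1/F0 no matter which phase configuration is chosen.

import numpy as np

fs, f0, dur = 16000, 200.0, 0.25
t = np.arange(int(fs * dur)) / fs
rng = np.random.default_rng(1)

for trial in range(3):                        # three random phase patterns
    phases = rng.uniform(0, 2 * np.pi, 10)
    x = sum(np.sin(2 * np.pi * (n + 1) * f0 * t + phases[n])
            for n in range(10))               # 10-harmonic complex
    ac = np.correlate(x, x, mode="full")[x.size - 1:]
    lag = np.argmax(ac[20:320]) + 20          # search lags 1.25 to 20 msec
    print(f"trial {trial}: peak at {1000 * lag / fs:.2f} msec")  # ~5 msec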
--Peter Cariani
===================
From: B. Suresh Krishna
Hi
Wrt your query on the list, could you please summarize the responses and
post them, or else send me a file with all the responses? I would be very
interested in the answers myself.
Thanks !!
Suresh
"B. Suresh Krishna" <suresh@cns.nyu.edu>
===================
From: Bill Hartmann
Sasha,
The tone color depends on phases, but I don't know of any claim for a
reliable pitch effect.
Because pitch is manipulable, it is likely that individual listeners get
phase effects. A systematic effect might be interesting.
Bill
HARTMANN@pa.msu.edu
===================
From: Adrian Houtsma
Dear Alex,
Look, for instance, at Houtsma & Smurzynski, J. Acoust. Soc. Am. 87, 304,
1990. Figure 3 shows that pitch discrimination for resolved harmonics
(N<10) is rather independent of phase relations (in this case sine-phase vs
Schroeder-phase). For unresolved harmonics (N>10) discrimination is (1)
more difficult since jnds are much larger, and (2) even more difficult for
Schroeder-phase tones than for sine-phase tones.
Similar evidence can be found in the dissertation of Hoekstra, cited in that
paper.
The general rule of thumb is: resolved harmonics -> no phase sensitivity;
unresolved harmonics -> large phase sensitivity.
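A minimal Python sketch of the two phase conditions (assuming numpy; the
phase formula phi_n = pi*n*(n+1)/N is the commonly cited Schroeder (1970)
low-peak-factor form, and all parameter values are illustrative):

import numpy as np

def harmonic_complex(f0, n_harm, fs, dur, schroeder=False):
    t = np.arange(int(fs * dur)) / fs
    x = np.zeros_like(t)
    for n in range(1, n_harm + 1):
        phi = np.pi * n * (n + 1) / n_harm if schroeder else 0.0
        x += np.sin(2 * np.pi * n * f0 * t + phi)  # sine phase when phi = 0
    return x / n_harm

fs = 22050
sine_ph = harmonic_complex(200.0, 20, fs, 0.5)
schr_ph = harmonic_complex(200.0, 20, fs, 0.5, schroeder=True)
# Identical power spectra, very different waveform peak factors:
print(np.abs(sine_ph).max(), np.abs(schr_ph).max())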
Best wishes,
Adrian Houtsma
===================
From: Shlomo Dubnov
Hi,
Well, it depends on what your criterion / definition for pitch is. There are
works that use phase information to determine whether a signal is pitched or
noise (more precisely, whether it is a voiced or unvoiced signal in speech).
In case this is relevant to your question, I can provide you with references.
best,
--
Shlomo Dubnov ---------- e-mail: dubnov@ircam.fr
===================
From: Leslie Smith
Dear Alexander:
There are occasions when the phase can affect the timbre (and the
pitch too) when the harmonics are close together, so that they are
not resolved at the cochlea. Then the envelope shape (as perceived
at the hair cells) will depend on the phase (given 3 or more
harmonics).
See
Smith L.S. Data-driven Sound Interpretation: its Application to
Voiced Sounds. pp. 147-154, in Neural Computation and Psychology:
Proceedings of the 3rd Neural Computation and Psychology Workshop
(NCPW3), Stirling, Scotland, 31 August - 2 September 1994, editors
L.S. Smith and P.J.B. Hancock. Springer Verlag: Workshops in
Computing Series, 1995.
but an earlier (and better!) reference (which I was not aware of
when I wrote the above) is
Moore B.C.J., Effects of relative phase of the components on the
pitch of three-component complex tones, in Psychophysics and
Physiology of Hearing, edited by E.F. Evans and J.P. Wilson,
Academic Press, 1977.
--leslie smith
l.s.smith@cs.stir.ac.uk
===================
From: louisew@biols.susx.ac.uk (Louise White)
Alex,
Try Shackleton and Carlyon JASA 95 (6) June 1994 3529-3540
Louise
louisew@biols.susx.ac.uk (Louise White)
===================
From: Brian C. J. Moore
Look at:
Moore, B. C. J. (1977). Effects of relative phase of the components on the
pitch of three-component complex tones. In Psychophysics and Physiology of
Hearing, (ed. E. F. Evans and J. P. Wilson), pp. 349-358. Academic Press,
London.
Brian C. J. Moore, Ph.D.
bcjm@pop.cus.cam.ac.uk
===================
From: Richard Lyon
John Pierce published in JASA about 5 years ago on some studies of signals
that had pitch dependent on phase (changing by two octaves, not a small
shift). His paper has references to earlier work by some of his buddies at
Bell (but I don't recall which ones right now). Look for him in the JASA
index.
D
lyon@pop.ricochet.net
===================
From: Bill Schottstaedt
I think John Pierce has written a couple papers on that subject:
Pierce, J. R. (1990). Rate, place, and pitch with tonebursts.
Music Perception, 7(3):205-212.
Pierce, J. R. (1991a). Periodicity and pitch perception.
Journal of the Acoustical Society of America, 90:1889-1893.
bil@ccrma.Stanford.EDU (Bill Schottstaedt)
===================
From: Richard Parncutt
How about...
Langner, G. (1997). Temporal processing of pitch in the auditory system.
Journal of New Music Research, 26, 116-132.
Langner, G., & Schreiner, C.E. (1988). Periodicity coding in the inferior
colliculus of the cat (I) - Neuronal mechanisms. Journal of Neurophysiology,
60, 1799-1822.
Meddis, R., & Hewitt, M.J. (1991a). Virtual pitch and phase sensitivity of a
computer model of the auditory periphery. Journal of the Acoustical Society
of America, 89, 2866-2894.
Patterson, R.D. (1973). The effects of relative phase and the number of
components on residue pitch. Journal of the Acoustical Society of America,
53, 1565-1572.
That reminds me! I'd be grateful for brief, private, independent assessment
of Langner's work.
Richard Parncutt, Email: r.parncutt@keele.ac.uk.
===================
From: Bob Carlyon
Prof. Galembo,
Trevor Shackleton and I did some experiments showing that, compared to a
sine-phase stimulus, an alternating-phase stimulus has a pitch about 1
octave higher, provided that its harmonics are unresolved by the peripheral
auditory system. We also review earlier papers showing similar
pitch-doubling effects. The article is published in JASA vol 95, p3529-3540
(1994).
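A minimal Python sketch of the comparison (assuming numpy; the harmonic
numbers and F0 below are illustrative, not the exact stimuli of the paper):

import numpy as np

def complex_tone(f0, harmonics, fs, dur, alternating=False):
    t = np.arange(int(fs * dur)) / fs
    x = np.zeros_like(t)
    for n in harmonics:
        # ALT phase: odd harmonics in sine phase, even harmonics in cosine
        phi = np.pi / 2 if (alternating and n % 2 == 0) else 0.0
        x += np.sin(2 * np.pi * n * f0 * t + phi)
    return x / len(harmonics)

fs, f0 = 22050, 100.0
unresolved = range(12, 25)      # high, peripherally unresolved harmonics
sine_tone = complex_tone(f0, unresolved, fs, 0.5)
alt_tone = complex_tone(f0, unresolved, fs, 0.5, alternating=True)
# The ALT tone's envelope repeats at 2*F0, and it is heard ~1 octave higher.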
regards
bob carlyon
email: bob.carlyon@mrc-apu.cam.ac.uk
===================
From: Malcolm Slaney
At 6:28 AM -0800 2/4/98, Alexander Galembo wrote:
>I would appreciate being informed of any publications casting doubt
>on this phase independence (if they exist).
I think the first were Flanagan and Guttman (1960). I'm not sure if they
described their observations as a phase change, but they are. Pierce, a
few years ago, redid the experiments.
A description of these stimuli that Richard Duda wrote for the Apple
Hearing Demo Reel is appended to the end of my note.
Malcolm
This animation was produced in conjunction with Richard Duda of the
Department of Electrical Engineering at San Jose State University during
the Summer of 1989. Thanks to Richard Duda for both the audio examples and
the explanation that follows and to John Pierce for calling this experiment
to our attention.
Researchers in psychoacoustics have long looked to cochlear models to
explain the perception of musical pitch [Small70]. Many experiments have
made it clear that the auditory system has more than one mechanism for
pitch estimation. In one of these experiments, Flanagan and Guttman used
short-duration impulse trains to investigate two different mechanisms for
matching periodic sounds, one based on spectrum and one based on pulse rate
[Flanagan60]. They used two different impulse trains, one having one pulse
per period of the fundamental, the other having four pulses per period,
every fourth pulse being negative. These signals have the interesting
property that they have the same power
spectrum, which seems to suggest that they should have the same pitch. The
standard conclusion, however, was that below 150 pulses per
second the trains "matched" if they had the same pulse rate; they "matched"
on spectrum only when the fundamental frequency was above about
200 Hz.
[Pierce89] modified this experiment by replacing the pulses by tone
bursts: short periods of a 4,800-Hz sine wave modulated by a raised-cosine
(Hamming) window. In essence, he used Flanagan and Guttman's pulses to
amplitude modulate a steady high-frequency carrier. His purpose in
doing this was to narrow the spectrum, keeping the large response of the
basilar membrane near one place (the 4,800-Hz place), regardless of
pulse rate.
To be more specific, Pierce used the three signal "patterns" shown below.
All have the same burst duration, which is one-eighth of a pattern
period. Pattern a has four bursts in a pattern period. Pattern b has the
same burst rate or pulse rate, but every fourth burst is inverted in phase.
Thus, the fundamental frequency of b is a factor of four (two octaves)
lower than that of a. Pattern c has only one burst per pattern period,
and thus has the same period as b; in fact, it can be shown that b and c
have the same power spectrum. Thus, a and b sound alike at low
pulse rates where pulse-rate is dominant, and b and c sound alike at high
pulse rates where spectrum is dominant. Pierce observed that the ear
matches a and b for pattern frequencies below 75 Hz, and matches b and c
for pattern frequencies above 300 Hz. He found the interval
between 75 and 300 Hz to be ambiguous, the b pattern being described as
sounding inharmonic.
[Figure: Pierce's tone bursts. Patterns a and b have the same pulse-rate
frequency, while b and c have the same power spectrum. Here the test
sounds are shown with one cycle per burst.]
To see if and how these results are reflected in correlograms, a similar
set of tone burst signals was generated. The only difference between our
signals and Pierce's signals was due to differences in the digital sampling
rate used. To get a Fourier spectrum with minimum spectral splatter,
Pierce imposed two requirements:
1) The tone-burst frequency fb was set at half the Nyquist
rate. Where Pierce's 19,200-Hz sampling rate led to fb = 4,800 Hz,
our 16,000-Hz sampling rate forced fb down to 4,000 Hz.
2) Each burst had to contain an exact integral number n of
cycles. This number, n, is a major parameter for the experiments,
ranging from 1 to 128. If the pattern period is T, then to
obtain exactly n cycles of frequency fb in time T/8 requires that
fb*T/8 = n, so that T = 8n/fb.
Thus, to obtain the same spectral characteristics, we had to use different
numerical values for the tone-burst frequency fb and the corresponding
pattern period T. The table shown below is our version of Table I in
Pierce's paper.
A set of eight test signals was generated according to this scheme. Each
test signal consists of a sequence of the a, b and c patterns, each
pattern lasting 1.024 seconds. This time interval was chosen to get an
exact integer number of bursts, ranging from 4 for Case 1c to 2000 for
Cases 8a and 8b.
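A minimal Python sketch of the three patterns (assuming numpy; n = 4 cycles
per burst is one illustrative choice, and the construction follows the
T = 8n/fb relation above):

import numpy as np

fs, fb, n = 16000, 4000.0, 4      # sample rate, burst freq, cycles per burst
T = 8 * n / fb                    # pattern period (s), from fb*T/8 = n
blen = int(fs * T / 8)            # each burst lasts one-eighth of T
t = np.arange(blen) / fs
win = 0.5 * (1 - np.cos(2 * np.pi * np.arange(blen) / blen))  # raised cosine
burst = win * np.sin(2 * np.pi * fb * t)
gap = np.zeros(blen)

pat_a = np.concatenate([burst, gap] * 4)                  # 4 bursts per T
pat_b = np.concatenate([burst, gap] * 3 + [-burst, gap])  # every 4th inverted
pat_c = np.concatenate([burst] + [gap] * 7)               # 1 burst per T
sig_a = np.tile(pat_a, int(1.024 / T))   # ~1.024-s test sequence, as above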
Malcolm Slaney malcolm@interval.com
===================
From: "R. Parncutt" <psa03@CC.KEELE.AC.UK>
Subject: effect of phase on pitch
Pondering the evolutionary origins of the ear's "phase deafness" in most
naturally occurring sounds, I have come up with the following argument. Does
it make sense? Is there other literature on this subject that I have missed?
:::
In everyday listening environments, phase relationships are typically
jumbled unrecognizably when sound is reflected off environmental objects;
that is, when reflected sounds of varying amplitudes (depending on the
specific configuration and physical properties of the reflecting materials)
are added onto sound traveling in a direct line from the source. Thus, phase
information does not generally carry information that can reliably aid a
listener in identifying sound sources in a reverberant environment
(Terhardt, 1988; see also Terhardt, 1991, 1992). This is a matter of
particular concern in an ecological approach, as non-reverberant
environments are almost non-existent in the real world (anechoic rooms,
mountain tops). On the other hand, again in real acoustic environments,
spectral frequencies (that is, the frequencies of isolated components of
complex sounds, or clear peaks in a running spectrum, forming frequency
trajectories in time-varying sounds) cannot be directly affected by
reflection off, or transmission through, environmental obstacles. They might
be indirectly affected as a byproduct of the effect that such manipulations
can have on amplitudes (e.g., a weakly defined peak could be pushed sideways
if amplitudes increased on one side and decreased on the other), but such
phenomena could hardly affect audible sound spectra.
So for the auditory system to reliably identify sound sources, it needs to
ignore phase information, which is merely a constant distraction, and focus
as far as possible on a signal's spectral frequencies (and to a lesser
extent on the relative amplitudes of individual components, keeping in mind
that these, too, are affected by reflection and transmission). The ear's
phase deafness with regard to pitch perception is thus a positive attribute.
In fact, it may be regarded as an important phylogenetic achievement - the
result of a long evolutionary process in which animals whose ears allowed
phase relationships to interfere with the identification of dangerous or
otherwise important sound sources died before they could reproduce. If this
scenario is correct, then it is no surprise that we are highly sensitive to
small changes in frequency, and highly insensitive to phase relationships
within complex sounds.
Straightforward evidence of the ear's insensitivity to phase in the sounds
of the real human environment has been provided by Heinbach (1988). He
reduced natural sounds including speech (with or without background noise
and multiple speakers) and music to their spectral contours, which he called
the part-tone-time-pattern. In the process, he completely discarded all
phase information. The length of the spectrum analysis window was carefully
tuned to that of the ear, which depends on frequency. Finally, he
resynthesized the original sounds, using random or arbitrary phase
relationships. The resynthesized sounds were perceptually indistinguishable
from the originals, even though their phase relationships had been shuffled.
It is nevertheless possible to create artificial stimuli for which clear,
significant perceptual effects of phase relationships on perception can be
demonstrated. For example, Patterson (1973, 1987) demonstrated that
listeners can discriminate two harmonic complex tones on the basis of phase
relationships alone. Moore (1977) demonstrated that the relative phase of
the components affects the pitch of harmonic complex tones consisting of
three components; for each tone, there were several possible pitches, and
relative phase affected the probability of a listener hearing one of those
as 'the' pitch. Hartmann (1988) demonstrated that the audibility of a
partial within a harmonic complex tone depends on its phase relationship
with the other partials. Meddis & Hewitt (1991b) succeeded in modeling these
various phase effects, which (as Moore, 1977, explained) generally apply
only to partials falling within a single critical band or auditory filter.
In an ecological approach, the existence of phase sensitivity in such
stimuli (or such comparisons between stimuli) might be explained as follows.
These stimuli (or stimulus comparisons) do not normally occur in the human
environment. So the auditory system has not had a chance to 'learn' (e.g.,
through natural selection) to ignore the phase effects. As hard as the ear
might 'try' to be phase deaf in the above cases, some phase sensitivity will
always remain, for unavoidable physiological reasons.
There could, however, be some survival value associated with the ability to
use phase relationships to identify sound sources during the first few tens
of ms of a sound, before the arrival of interference from reflected waves in
typical sound environments. On this basis, we might expect phase
relationships at least to affect timbre, even in familiar sounds. Supporting
evidence for this idea in the case of synthesized musical instrument sounds
has recently been provided by Dubnov & Rodet (1997). In the case of speech
sounds, Summerfield & Assmann (1990) found that pitch-period asynchrony
aided in the separation of concurrent vowels; however, the effect was
greater for less familiar sounds (specifically, it was observed at
fundamental frequencies of 50 Hz but not 100 Hz). In both cases, phase
relationships affected timbre but not pitch.
The model of Meddis & Hewitt (1991a) is capable of accounting for known
phase dependencies in pitch perception (Meddis & Hewitt, 1991b). This raises
the question: why might it be necessary or worthwhile to model something
that does not have demonstrable survival value for humans (whereas music
apparently does have survival value, as evidenced by the universality of
music in human culture)? As Bregman (1981) pointed out, we need to "think
about the problems that the whole person faces in using the information
available to his or her sense organs in trying to understand an environment"
(p. 99). From this point of view, the human ear might be better off without
any phase sensitivity at all. Bregman goes on to say that "Because
intelligent machines are required actually to work and to achieve useful
results, their designers have been forced to adopt an approach that always
sees a smaller perceptual function in terms of its contribution to the
overall achievement of forming a coherent and useful description of the
environment." So if one were building a hearing robot, there would be no
point in incorporating effects of phase on pitch perception, if such effects
did not help the robot to identify sound sources.
Bregman, A.S. (1981). Asking the 'What for?' question in auditory
perception. In M. Kubovy & J. R. Pomerantz (Eds.), Perceptual organization
(pp. 99-118). Hillsdale, NJ: Erlbaum.
Dubnov, S., & Rodet, X. (1997). Statistical modeling of sound
aperiodicities. Proceedings of the International Computer Music Conference,
Thessaloniki, Greece, (pp. 43-50).
Hartmann, W. (1988). Pitch perception and the segregation and integration of
auditory entities. In G. M. Edelman, W. E. Gall, & W. M. Cowan (Eds.),
Auditory function (pp. 623-645). New York: Wiley.
Heinbach, W. (1988). Aurally adequate signal representation: The
Part-Tone-Time-Pattern. Acustica, 67, 113-121.
Meddis, R., & Hewitt, M.J. (1991a). Virtual pitch and phase sensitivity of a
computer model of the auditory periphery. I: Pitch identification. Journal
of the Acoustical Society of America, 89, 2866-2882.
Meddis, R., & Hewitt, M.J. (1991b). Virtual pitch and phase sensitivity of a
computer model of the auditory periphery II: Phase sensitivity. Journal of
the Acoustical Society of America, 89, 2883-2894.
Moore, B.C.J. (1977). Effects of relative phase of the components on the
pitch of three-component complex tones. In E. F. Evans & J. P. Wilson
(Eds.), Psychophysics and physiology of hearing (2nd ed.) (pp. 349-362). New
York: Academic.
Patterson, R.D. (1973). The effects of relative phase and the number of
components on residue pitch. Journal of the Acoustical Society of America,
53, 1565-1572.
Patterson, R.D. (1987). A pulse ribbon model of monaural phase perception.
Journal of the Acoustical Society of America, 82, 1560-1586.
Summerfield, Q., & Assmann, P. F. (1990). Perception of concurrent vowels:
Effects of harmonic misalignment and pitch-period asynchrony. Journal of the
Acoustical Society of America, 89, 1364-1377.
Terhardt, E. (1988). Psychoakustische Grundlagen der Beurteilung
musikalischer Klänge. In J. Meyer (Ed.), Qualitätsaspekte bei
Musikinstrumenten (pp. 9-22). Celle: Moeck.
Terhardt, E. (1991). Music perception and sensory information acquisition:
Relationships and low-level analogies. Music Perception, 8, 217-240.
Terhardt, E. (1992). From speech to language: On auditory information
processing. In M. E. H. Schouten (Ed.), The auditory processing of speech
(p. 363-380). Berlin: Mouton de Gruyter.
Richard Parncutt
-------------------------------------
From: "PETER B.L. Meijer" <meijer@NATLAB.RESEARCH.PHILIPS.COM>
Subject: Re: effect of phase on pitch
February 5, 1998
Richard Parncutt wrote a very interesting discussion / essay
on "phase deafness", and seems to make a distinction between
artificial and natural sounds:
> In an ecological approach, the existence of phase sensitivity in such
> stimuli (or such comparisons between stimuli) might be explained as follows.
> These stimuli (or stimulus comparisons) do not normally occur in the human
> environment. So the auditory system has not had a chance to 'learn' (e.g.,
> through natural selection) to ignore the phase effects. As hard as the ear
> might 'try' to be phase deaf in the above cases, some phase sensitivity will
> always remain, for unavoidable physiological reasons.
I have a complex-sound generating application, so far based on
my assumption that phases may be neglected: phases are random.
Also, the sound components are normally not harmonic, so any
momentary phase relations will change over time. However, these
sounds, derived from spectrographic synthesis of environmental
images instead of spectrographic (re)synthesis of spectrograms,
definitely ``do not normally occur in the human environment,''
and involve both ``tens of ms'' bursts as well as sounds of
much longer duration. So, should I or should I not have tried to
exploit phase sensitivity and enforce certain, e.g., short-term,
phase relations? Or should I hope (in vain?) that people can
"un-learn" to hear most of the (if any) phase effects? Any advice?
See
http://ourworld.compuserve.com/homepages/Peter_Meijer/winvoice.htm
for the video sonification application I refer to.
In other words, my question relates to how to optimize auditory
perception / resolution in complex information-carrying sounds,
and I wonder if I should "do something" with phases or not.
There is non-evolutionary survival value at stake here.
Best wishes,
Peter Meijer
===================
From: Bert Schouten <Bert.Schouten@LET.RUU.NL>
Subject: Re: effect of phase on pitch
Perceptual effects of phase on pitch or timbre could be epiphenomena of a
mechanism needed for sound localization. We need some form of phase-locking
in the auditory nerve in order to be able to compare the signals from the
two ears. In natural environments the ear receives no phase information
about the sound source, so pitch and timbre cannot normally be based on
temporal information, but the sensitivity to temporal differences between
the two ears may influence pitch or timbre whenever headphones are used or
when phones are inserted into animals' ear canals.
I agree, therefore, with Richard Parncutt's evaluation of the lack of
relevance of phase information for pitch and timbre, but I prefer an
epiphenomenal rather than a phylogenetic explanation for any residual
effects. I am looking askance now at Peter Cariani, with whom I have had
this argument before.
Bert
Bert Schouten (M.E.H.)
===================
From: Toth Laszlo <tothl@inf.u-szeged.hu>
Subject: Re: effect of phase on pitch
On Thu, 5 Feb 1998, R. Parncutt wrote:
> ...
> Straightforward evidence of the ear's insensitivity to phase in the sounds
> of the real human environment has been provided by Heinbach (1988). He
> reduced natural sounds including speech (with or without background noise
> and multiple speakers) and music to their spectral contours, which he called
> the part-tone-time-pattern. In the process, he completely discarded all
> phase information. The length of the spectrum analysis window was carefully
> tuned to that of the ear, which depends on frequency. Finally, he
> resynthesized the original sounds, using random or arbitrary phase
> relationships. The resynthesized sounds were perceptually indistinguishable
> from the originals, even though their phase relationships had been shuffled.
>
"Perceptually indistinguishable" means here only that their PITCHes were
perceptually indistingushable, am I right? Considering other aspects,
changing the phase relationships definitely has effects on sound quality.
In phase vocoders, for example, uncorrect decoding of phases results in
really annoying artifacts.
Toth Laszlo
e-mail: tothl@inf.u-szeged.hu
===================
From: Peter Cariani <peter@epl.meei.harvard.edu>
Subject: Re: effect of phase on pitch
R. Parncutt wrote:
> Pondering the evolutionary origins of the ear's "phase deafness" in most
> naturally occurring sounds, I have come up with the following argument. Does
> it make sense? Is there other literature on this subject that I have missed?
I definitely agree that the auditory system is essentially phase-deaf,
except around the edges (which is why the edges are interesting).
However, where we would differ is that I think it is possible that the
phase-deafness of the system is a result of interspike interval analyses
and mechanisms that integrate/fuse invariant phase relationships into
unified objects, whereas you would hold that this system is phase deaf
because it uses rate-place representations. Is this fair?
> In everyday listening environments, phase relationships are typically
> jumbled unrecognizably when sound is reflected off environmental objects;
> that is, when reflected sounds of varying amplitudes (depending on the
> specific configuration and physical properties of the reflecting materials)
> are added onto sound traveling in a direct line from the source. Thus, phase
> information does not generally carry information that can reliably aid a
> listener in identifying sound sources in a reverberant environment
> (Terhardt, 1988; see also Terhardt, 1991, 1992).
Let's consider an echo off one surface that introduces a time delay. To the
extent that the echo's time pattern resembles that of the original stimulus,
then depending upon the delay between the two, the sound and its echo can be
fused into one object. In an ecological situation, sound-reflecting surfaces
and their properties are not changing rapidly. The phase structure of echoes
combined with the phase structure of the direct sound will then form an
invariant whole, so that if one has a mechanism for fusing together repeated
relative phase patterns, echoes become fused with the direct signal (i.e.,
fusion is a different strategy for "echo suppression"). At short delays
(<15 msec) one hears only one sound; at longer delays the timbre of the one
sound changes, and at really long delays one hears two sounds. These
differences would be related to how the auditory system integrates recurrent
patterns with different delays.
In such situations, one would not generally be able to distinguish one
particular phase pattern from another, but it would be important that the
time structure of the signal and that of the echo be largely similar in
order for fusion to take place.
I don't think things get much more Gibsonian than this. If the auditory
system operates this way, then there is an invariant time pattern in the
sound environment that the sound and the echo share that is extracted by
the auditory system. One way to think about this is that the auditory
system brings the correlation structure of sound & echo into the nervous
system by means of phase-locked discharges.
This phase-locking is seen in every sensory system, albeit on different
time scales, so stimulus-driven time structure has been around at least as
long as sensory receptors and sensory neurons. Essentially, if the fine
structure of the
stimulus is present in the timings of discharges, then it is possible to
carry out very, very general kinds of pattern recognition operations that
extract invariant time structure from what amounts to an analog, iconic
representation of the sound.
This is much closer to Gibsonian ideas concerning mechanisms of perception
than models based on spectral features (perceptual atoms) and complex
pattern recognition.
> This is a matter of particular concern in an ecological approach, as
> non-reverberant environments are almost non-existent in the real world
> (anechoic rooms, mountain tops). On the other hand, again in real acoustic
> environments, spectral frequencies (that is, the frequencies of isolated
> components of complex sounds, or clear peaks in a running spectrum, forming
> frequency trajectories in time-varying sounds) cannot be directly affected
> by reflection off, or transmission through, environmental obstacles. They
> might be indirectly affected as a byproduct of the effect that such
> manipulations can have on amplitudes (e.g., a weakly defined peak could be
> pushed sideways if amplitudes increased on one side and decreased on the
> other), but such phenomena could hardly affect audible sound spectra.
> So for the auditory system to reliably identify sound sources, it needs to
> ignore phase information, which is merely a constant distraction, and focus
> as far as possible on a signal's spectral frequencies (and to a lesser
> extent on the relative amplitudes of individual components, keeping in mind
> that these, too, are affected by reflection and transmission).
In a sense we are saying similar things here.
Interspike interval distributions, like rate-place profiles, are both
"phase-deaf" representations, and form analysis is based on such basic
"phase-deaf" representations.
> The ear's phase deafness with regard to pitch perception is thus a positive
> attribute. In fact, it may be regarded as an important phylogenetic
> achievement - the result of a long evolutionary process in which animals
> whose ears allowed phase relationships to interfere with the identification
> of dangerous or otherwise important sound sources died before they could
> reproduce. If this scenario is correct, then it is no surprise that we are
> highly sensitive to small changes in frequency, and highly insensitive to
> phase relationships within complex sounds.
Localization of sound is important, but it is no less important to be able
to recognize the forms of sounds, to be able to distinguish and recognize
different sound sources. The reason that we talk so much in terms of
localization is that we understand more of how localization mechanisms
operate: what are the cues, what are the neural computations. One could
make an analogous argument that it is important to be able to detect
arbitrary recurring sound patterns that come in at different times, and
that therefore basic mechanisms evolved that integrate similar time
patterns over many delays. Such mechanisms would be deaf to the particular
phases of sounds, but sensitive to transient changes in phase structure.
Birds and humans detect mistuned harmonics quite well. Why is this? The
harmonic complex has a constant phase structure that recurs from period to
period and the mistuned component has a constant phase structure that
recurs at its own unrelated period.
Phase relations between the complex and the mistuned component are
constantly changing. Two sounds are heard because invariant waveform/phase
patterns are fused together and varying sets of relations are separated.
Similar kinds of considerations apply to double vowels with different F0's.
> Straightforward evidence of the ear's insensitivity to phase in the sounds
> of the real human environment has been provided by Heinbach (1988). He
> reduced natural sounds including speech (with or without background noise
> and multiple speakers) and music to their spectral contours, which he called
> the part-tone-time-pattern. In the process, he completely discarded all
> phase information. The length of the spectrum analysis window was carefully
> tuned to that of the ear, which depends on frequency. Finally, he
> resynthesized the original sounds, using random or arbitrary phase
> relationships. The resynthesized sounds were perceptually indistinguishable
> from the originals, even though their phase relationships had been shuffled.
Yes, but these sounds still had the same time-pattern within each freq.
channel and the relations of time-patterns across channels were presumably
stable over the course of the stimulus. If the interchannel phase relations
were constantly changing, I think the sound would not have the same
quality. If you introduced
many random delays at different timepoints into the different frequency
channels, I would think that these sounds would break apart.
I've experimented with sequences of vowel periods having different phase
relationships. One can take the waveform of a vowel period and flip its
polarity and/or reverse it in time. This results in 4 possible operations
for each vowel period. If you do this in an orderly, regular, repeating
way, the resulting waveform has a pitch corresponding to the recurrence
period of the whole pattern. If you randomize the sequences, the waveform
has a very noisy pitch and has a very different quality, and if you
introduce random time delays in between the vowel periods in addition to
the random phase switches, the pitch goes away.
Now the short-term spectral structure of this sound is constant, but the
time-relations between events in one vowel period and another have been
destroyed.
Voice pitches of vowels thus can be seen as the result of recurrent phase
patterns that span vowel periods. It is the delay between the patterns (the
recurrence time) that determines the pitch. If there are no recurrent phase
patterns there is no pitch. Recurrence time of phase (time interval)
defines frequency.
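A minimal Python sketch of these manipulations (assuming numpy; the
one-period "vowel" below is a hypothetical two-component stand-in for a
real vowel period):

import numpy as np

rng = np.random.default_rng(0)

def build(period, n_periods, order="regular", random_gaps=False):
    ops = [lambda p: p,            # identity
           lambda p: -p,           # polarity flip
           lambda p: p[::-1],      # time reversal
           lambda p: -p[::-1]]     # flip and reversal
    out = []
    for i in range(n_periods):
        k = i % 4 if order == "regular" else rng.integers(4)
        out.append(ops[k](period))
        if random_gaps:            # random delay between vowel periods
            out.append(np.zeros(rng.integers(0, period.size // 2)))
    return np.concatenate(out)

t = np.arange(128) / 16000.0       # one 8-msec "vowel" period (125 Hz)
vowel = np.sin(2 * np.pi * 700 * t) + 0.5 * np.sin(2 * np.pi * 1100 * t)
regular = build(vowel, 200)                          # clear recurrence pitch
scrambled = build(vowel, 200, order="random")        # noisy pitch
gapped = build(vowel, 200, "random", random_gaps=True)  # pitch gone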
> It is nevertheless possible to create artificial stimuli for which clear,
> significant perceptual effects of phase relationships on perception can be
> demonstrated. For example, Patterson (1973, 1987) demonstrated that
> listeners can discriminate two harmonic complex tones on the basis of phase
> relationships alone.
I think that this discrimination was on the basis of a timbre difference. I agree
that phase relations can in some cases alter the relative influence of
particular harmonics and thereby influence timbre.
> Moore (1977) demonstrated that the relative phase of the components
> affects the pitch of harmonic complex tones consisting of three
> components; for each tone, there were several possible pitches, and
> relative phase affected the probability of a listener hearing one of those
> as 'the' pitch.
These several possible pitches, I assume, were associated with partials
that could be heard rather than with F0.
Again phase structure can subtly alter the relative salience of particular
harmonics, and hence the partials that are best heard.
> Hartmann (1988) demonstrated that the audibility of a
> partial within a harmonic complex tone depends on its phase relationship
> with the other partials.
Yes.
> Meddis & Hewitt (1991b) succeeded in modeling these
> various phase effects, which (as Moore, 1977, explained) generally apply
> only to partials falling within a single critical band or auditory filter.
I think what happens is that relative phase can affect which harmonic is
most effective at creating discharges that are phase-locked to it.
> In an ecological approach, the existence of phase sensitivity in such
> stimuli (or such comparisons between stimuli) might be explained as follows.
> These stimuli (or stimulus comparisons) do not normally occur in the human
> environment. So the auditory system has not had a chance to 'learn' (e.g.,
> through natural selection) to ignore the phase effects. As hard as the ear
> might 'try' to be phase deaf in the above cases, some phase sensitivity will
> always remain, for unavoidable physiological reasons.
But these effects are all extremely subtle. I don't think vowel quality
ever changes so radically that one hears a completely different vowel.
But why are there these kinds of subtle effects at all?
From a rate-perspective, one could argue for some
kind of slight rate-suppression that depended on relative phases of
closely spaced harmonics. The interval account would be similar,
except that instead of rate suppression, one would have interval
or synchrony suppression.
> There could, however, be some survival value associated with the ability to
> use phase relationships to identify sound sources during the first few tens
> of ms of a sound, before the arrival of interference from reflected waves in
> typical sound environments. On this basis, we might expect phase
> relationships at least to affect timbre, even in familiar sounds. Supporting
> evidence for this idea in the case of synthesized musical instrument sounds
> has recently been provided by Dubnov & Rodet (1997). In the case of speech
> sounds, Summerfield & Assmann (1990) found that pitch-period asynchrony
> aided in the separation of concurrent vowels; however, the effect was
> greater for less familiar sounds (specifically, it was observed at
> fundamental frequencies of 50 Hz but not 100 Hz). In both cases, phase
> relationships affected timbre but not pitch.
> The model of Meddis & Hewitt (1991a) is capable of accounting for known
> phase dependencies in pitch perception (Meddis & Hewitt, 1991b). This raises
> the question: why might it be necessary or worthwhile to model something
> that does not have demonstrable survival value for humans (whereas music
> apparently does have survival value, as evidenced by the universality of
> music in human culture)?
It's certainly premature to judge what kinds of auditory representations
have or don't have "demonstrable survival value for humans." Phase
dependencies may
be side issues in ecological terms, but they do shed light on basic
auditory mechanisms.
Deciding what is evolutionarily-relevant is difficult at best.
In arguing that because music perception is culturally universal it must
have survival value, I think one commits an evolutionary fallacy: that
every capability is the result of a particular adaptation to a particular
ecological demand.
Even Steven Pinker doesn't go this far. At least he would say that music
perception could be a by-product of other adaptations.
It's very hard indeed to identify what the inherent survival value of music
would be. And there can be generalist evolutionary strategies and
general-purpose pattern recognizers, so that it is not always the case that
evolutionary demands and solutions have to be so parochial....... (most of
vision isn't face recognition, even if one thinks that face recognition is
a special purpose module selected for a special-purpose ecological demand
-- we see all sorts
of complex patterns that our evolutionary forebears never encountered. We
were not evolutionarily selected to read text such as this, but we can do
it because our visual mechanisms have sufficient generality that we can
learn to recognize letters and words).
I'd rather we avoid particularistic adaptive "just-so" stories to explain
away peculiarities of our senses.
However, studying music perception is very important even if music had/has
no inherent survival value for the species, because it gives us another
window on complex modes of auditory representation and processing. Music is
an important
aspect of auditory cognition, and your work on the structure of auditory
cognition is quite valuable regardless of whether music is essential to
survival.
Very general kinds of pattern recognition mechanisms are possible and could
very well be based on the nature of the basic auditory representations. For
example, if an all-order interval analysis is carried out by the central
auditory system, the harmonic relations (octaves, fifths, low-integer
frequency ratios) all fall out of the inherent harmonic structure of time
intervals and their interactions.
(I've read your book and know you don't like these kinds of Pythagorean
relations. But there they are.......)
Our perception of octave similarities would be the result of very basic
similarities in interval representations rather than the result of acquired
associations. According to this perspective, octave-similarities and
perception of missing fundamentals are the consequence of the operation of
phylogenetically-ancient neural coding systems.
We may be phase-deaf, but much of our auditory perception may be based on
phase-locking nonetheless.
-- Peter Cariani
===================
From: "Steven M. Boker" <sboker@CALLIOPE.PSYCH.ND.EDU>
Subject: Re: effect of phase on pitch
X-To: AUDITORY@VM1.MCGILL.CA
To: Multiple recipients of list AUDITORY <AUDITORY@VM1.MCGILL.CA>
Bert Schouten <Bert.Schouten@LET.RUU.NL> writes:
>Perceptual effects of phase on pitch or timbre could be epiphenomena
>of a mechanism needed for sound localization. We need some form of
>phase-locking in the auditory nerve in order to be able to compare
>the signals from the two ears. In natural environments the ear
>receives no phase information about the sound source, so pitch and
>timbre cannot normally be based on temporal information, but the
>sensitivity to temporal differences between the two ears may
>influence pitch or timbre whenever headphones are used or when
>phones are inserted into animals' ear canals.
>
This argument seems almost right, although I'd add that phase
information is highly predictive of self-motion. Thus there may
be a strong localization component of phase both for objects and
for self-location within the frame of reference. Similarly, phase
change is correlated with object acceleration within an environment.
If a sound source starts to move toward the listener and there is
a reflecting wall behind the sound source, both the sound source
acceleration and the acceleration of the source relative to the
wall would be predictable from the phase changes of the direct and
the reflected sound.
There is some research (see Stoffregen's recent paper for an
overview) into perception of self motion through auditory cues.
I argue that a large proportion of that information is contained
in phase. However, if we are to maintain Pitch Constancy (similar
to color constancy) for moving objects, phase changes must be
removed from the perception of pitch and relegated to motion
detection. There is some error in this process. The error can be
most readily seen in an environment where the subject is wearing
headphones, because then the phase changes are decoupled from the
other sensory modes of information about self-motion. It is partly
this multi-modal self motion information that allows the phase
information to be removed from the incoming auditory signal and
pitch constancy to be attained.
Cheers,
Steve
Steven M. Boker 219-631-4941 (office)
sboker@nd.edu 219-631-8883 (fax)
===================
From: Michael Kubovy <mk9y@virginia.edu>
Organization: University of Virginia, Department of Psychology
Subject: Re: effect of phase on pitch
Please note the following sorely neglected article, which shows that when one
hears a twelve-component uniform-amplitude harmonic complex in cosine phase in
which a single component has been shifted out of cosine phase, the singular
component segregates from the complex. At the same time, however, the pitch
of the fundamental does not seem to be affected. The article also shows that
if one passes the waveform of such a complex through a square-root or
cubic-root compressive transformation, the spectrum of the resulting waveform
has a peak at the frequency of the singular component.
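A rough Python sketch of that last demonstration (assuming numpy; the 12
equal-amplitude harmonics of 200 Hz, with the 7th shifted by pi/2, are
illustrative choices rather than the paper's exact stimuli):

import numpy as np

fs, f0 = 16000, 200.0
t = np.arange(fs) / fs            # 1 s of signal -> FFT bin width of 1 Hz

def compressed_level(shift_harm, probe_harm=7):
    # Spectrum level at probe_harm after cube-root compression.
    x = sum(np.cos(2 * np.pi * n * f0 * t
                   + (np.pi / 2 if n == shift_harm else 0.0))
            for n in range(1, 13))
    y = np.sign(x) * np.abs(x) ** (1 / 3)   # odd compressive nonlinearity
    return np.abs(np.fft.rfft(y))[int(probe_harm * f0)]

# Level at the 7th harmonic (1400 Hz) with vs. without the phase shift:
print(compressed_level(shift_harm=7), compressed_level(shift_harm=None))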
@article{kubovy79,
  author = {Kubovy, M. and Jordan, R.},
  title = {Tone-segregation by phase: On the phase sensitivity of the
    single ear},
  journal = {Journal of the Acoustical Society of America},
  volume = 66, number = 1, pages = {100--106}, year = 1979}
Some aspects of this work were followed up in:
@phdthesis{Daniel86,
  author = {Jane Elizabeth Daniel},
  title = {Detecting spectral and temporal differences in the harmonic
    complex},
  school = {Rutgers University, New Brunswick, NJ},
  year = 1986,
  note = {available from Rutgers's Library of Science and Medicine,
    BF.D184 1986}}
Michael Kubovy, Professor of Psychology
Dept. of Psychology, Univ. of Virginia
===================
From: B. Suresh Krishna
Thanks a lot!! For forwarding the responses and for asking the question.
Suresh
B. Suresh Krishna
Email: suresh@cns.nyu.edu
===================
From: Alex Galembo
Dear List,
I would appreciate anybody directing me to publications describing how a
gradual increase in the brightness of a tone might increase errors in
pitch estimation or evoke a pitch shift.
Once more, thanks a lot to all who participated in the discussion of phase
influence on pitch; it was very useful for me and, as I know, for some other
listers.
Alex Galembo
===================
From: "James W. Beauchamp" <jwb@timbre.music.uiuc.edu>
To: galembo@psyc.queensu.ca
Subject: your info request
Alex,
...
I don't know of any publications that report psychoacoustic tests of
brightness change leading to a pitch-change judgement. The results could
be highly dependent on individual subjects. For example, if you gradually
remove the odd harmonics of a sawtooth tone, at what point does one
report the pitch to be an octave higher? This must have to do with
thresholds. But it also has to do with listener expectation.
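A minimal Python sketch of this thought experiment (assuming numpy; the
attenuation schedule and F0 are illustrative). Since a sawtooth's harmonic
amplitudes fall off as 1/n, attenuating its odd harmonics to zero leaves an
exact sawtooth one octave up:

import numpy as np

def morphed_sawtooth(f0, g, fs=22050, dur=0.5, n_harm=40):
    t = np.arange(int(fs * dur)) / fs
    x = np.zeros_like(t)
    for n in range(1, n_harm + 1):
        a = (g if n % 2 == 1 else 1.0) / n   # odd harmonics scaled by g
        x += a * np.sin(2 * np.pi * n * f0 * t)
    return x

# g = 1: sawtooth at f0; g = 0: sawtooth at 2*f0, an octave higher.
steps = [morphed_sawtooth(110.0, g) for g in np.linspace(1.0, 0.0, 11)]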
...
Jim
===================
From: "R. Parncutt" <psa03@cc.keele.ac.uk>
Subject: Re: your mail
To: galembo@psyc.queensu.ca
Date: Fri, 13 Feb 1998 13:27:20 +0000 (GMT)
X-Mailer: ELM [version 2.4 PL23]
> I would appreciate anybody directing me to publications describing how a
> gradual increase in the brightness of a tone might increase errors in
> pitch estimation or evoke a pitch shift.
Terhardt's model would predict the pitch shift but I'm not sure how
plausible the predictions would be...
Best wishes,
Richard Parncutt, Lecturer in Psychology of Music and Psychoacoustics,
Unit for the Study of Musical Skill and Development, Keele University.
Post: Dept of Psychology, Keele University, Staffordshire ST5 5BG, GB.
Tel: 01782 583392 (w), 01782 719747 (h). Email: r.parncutt@keele.ac.uk.
Fax: +44 1782 583387. URL: http://www.keele.ac.uk/depts/ps/rpbiog.htm.
===================
From: Punita Singh
I have run into this phenomenon several times -- where the spectral locus
of components in a complex (correlated with timbral "sharpness" or
"brightness") affects judgments of pitch.
For details re: pitch discrimination affected by changes in spectral locus,
see:
Singh, P.G. and Hirsh, I.J. (1992) "Influence of spectral locus
and F0 changes on the pitch and timbre of complex tones", J. Acoust.
Soc. Am., 92(5), 2650-2661.
For interactions between brightness and pitch observed in a sequential
grouping context, see:
Singh, P.G. (1987) "Perceptual organization of complex-tone
sequences: A tradeoff between pitch and timbre?"
Both these papers contain other relevant references as well.
Another reference which comes to mind is a chapter by Hesse on judgment
of musical intervals in the book "Music, Mind and Brain", Manfred Clynes
(ed.), Plenum, 1983, which showed some interaction between pitch judgments
and brightness.
My 1990 Ph. D. dissertation from Washington University, St. Louis, on
"Perceptual correlates of spectral changes in complex tones" also
contains several references on this topic (prior to 1990)!
Punita
===================
From: Punita Singh <pgsingh@HOTMAIL.COM>
Subject: Re: Brightness affecting pitch
OOOOPS! I wanted to pitch in, but my lack of brightness interfered ..
Here are the Miss Singh details for the second reference:
> For interactions between brightness and pitch observed in a sequential
> grouping context, see:
> Singh, P.G. (1987) "Perceptual organization of complex-tone
> sequences: A tradeoff between pitch and timbre?"
J. Acoust. Soc. Am. 82(3), 886-899.
Re: the dissertation - it can be borrowed from the library at Washington
University, St. Louis, via inter-library loan, or can be ordered from UMI
at 1-800-521-0600, order no. 9122399.
>"Perceptual correlates of spectral changes in complex tones" (1990)
--- Punita
===================
From: Al Bregman <bregman@hebb.psych.mcgill.ca>
Subject: Re: your mail
To: Alexander Galembo <galembo@psyc.queensu.ca>
Cc: Roger Shepard, Carol Krumhansl, John Pierce, Auditory list
Dear Alexander,
Let me reply to your query about anomalies in pitch perception by
sandwiching my replies in between parts of your text. I hope you will not
mind my sending a copy to the AUDITORY list for comments and to the
colleagues that I mention below:
According to many music psychologists (Shepard, Krumhansl) there are two
aspects to "pitch", namely "pitch height" and "chroma". "Pitch height"
refers to the up-down dimension of pitch (e.g., the difference between A440
and A880). "Chroma" refers to the circular dimension, in which pitches
repeat again in every octave. I think brightness might be related to pitch
height.
Galembo:
> ... However, the pitch height dimension is not determined well [enough]
> to measure it.
> The known unit is an octave, but what about fractions of the pitch-
> height octave?
Bregman:
Since pitch height is a psychological property, not a physical one, any
measurement would have to come out of a multidimensional scaling. Perhaps
the best people to talk to about this would be:
Roger Shepard <ROGER@PSYCH.STANFORD.EDU>,
Carol Krumhansl <clk4@cornell.edu>
Galembo:
> For example, I have a bass A1 tone with equal amplitudes and all-sine
> phases of 100 harmonics, and another tone having alternate sine-cosine
> phases of the odd and even harmonics, respectively. When directly
> compared, the second tone sounds to some subjects an octave higher
> than the first. (This result corresponds, to some extent, to the finding by
> Carlyon and Shackleton related to higher unresolvable harmonics of
> middle-range fundamentals, published in JASA 95, 3529 (1994)).
> But this "A1-A1 octave" interval is not "strong" and might be judged in
> other situations as just increased brightness.
> If one compares the second ("higher") tone with a real A2 tone, the
> interval will also be an octave, but a "stronger" octave. (By "STRONGER"
> I mean "more distinctive", by analogy with pitch strength - stronger pitches
> should produce stronger intervals - is it right to say that?)
> Then would the sub-units of the octave on the pitch height scale have to be
> units of "octave strength"? But what then to do with this "octave
> strength" if the phase manipulation makes the pitch of the tone become a
> distinctive fifth of the fundamental?
> If I am understandable here, I would like to know your opinion.
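A minimal Python sketch of the two tones described in the quoted passage
(assuming numpy, and taking A1 as 55 Hz; the amplitude normalization is
illustrative):

import numpy as np

fs, f0, n_harm = 16000, 55.0, 100
t = np.arange(2 * fs) / fs                  # 2 s of signal
tone1 = sum(np.sin(2 * np.pi * n * f0 * t)  # all harmonics in sine phase
            for n in range(1, n_harm + 1))
tone2 = sum(np.sin(2 * np.pi * n * f0 * t   # odd: sine, even: cosine phase
                   + (np.pi / 2 if n % 2 == 0 else 0.0))
            for n in range(1, n_harm + 1))
tone1, tone2 = tone1 / n_harm, tone2 / n_harm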
Bregman:
I have often thought that the pitch height dimension could separate two
C's (for example) by more or by less than a conventional "octave",
depending on the spectrum of the tones. Musicians might not like this, but
it seems to agree with many phenomenological descriptions like yours.
Perhaps we should argue as follows: The octave is a musical concept, not a
perceptual one. We assume that an octave is defined by two tones that
have the same chroma, but different pitch heights. Typically, when the
spectra are similar (whatever that means), a 2:1 ratio of fundamental
frequencies gives rise to the difference in pitch height that we associate
with an octave. In such cases, (e.g., comparing notes on the same
instrument), we don't notice the contribution of the spectrum to pitch
height.
I think many people have noticed the difficulty with the identification of
a given fundamental with a definite pitch height, among them John Pierce
at CCRMA.
- Al
----------------------------------------------------------------------
Albert S. Bregman, Professor, Dept of Psychology, McGill University
1205 Docteur Penfield Avenue, Montreal, Quebec, Canada H3A 1B1.
Phone: +1 514-398-6103 Fax: -4896 Email: bregman@hebb.psych.mcgill.ca
Lab Web Page: http://www.psych.mcgill.ca/labs/auditory/laboratory.html
From: Bob Carlyon
Dear Al and Alex,
I emailed Alex with a reply to his question, which led to him citing
Carlyon and Shackleton. Al's email made me realise that there is another paper
of mine which suggests a pitch-like dimension which may perceptually
resemble tone height: JASA vol 102 p1097-1105 (1997): "The effects of two
temporal cues on pitch judgements". In it I also cite an article by Roy
Patterson in which he varies stimuli continuously from one octave to the
next without changing chroma: Contemp. Music Rev 9, p69-81 (1993): What is
the octave of a harmonically rich note?
cheers
bob
--------------------------------------------------------------------
Date: Wed, 11 Mar 1998 17:49:40 -0500
From: repp@lenny.haskins.yale.edu
To: galembo@psyc.queensu.ca
Subject: Effects of spectral envelope on pitch
Hi Sasha:
I read with interest Al Bregman's reply to your query. There are
probably many ways of demonstrating an influence of spectral envelope on
perceived pitch height, but the one I remember best is in a recent paper
of mine:
Repp, B. H. (1997). Spectral envelope and context effects in the tritone
paradox. Perception, 26, 645-665.
I show there that Shepard tones, which supposedly have a constant pitch
height and vary in chroma only, do vary in perceived pitch height.
This seems to be due to the fact that the shape of their discrete spectral
envelope varies, even though all envelopes are designed to fit under the
same continuous envelope function.
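A minimal Python sketch of the Shepard-tone construction at issue (assuming
numpy; the log-frequency Gaussian envelope below is an illustrative choice
of continuous envelope function):

import numpy as np

def shepard_tone(f0, fs=22050, dur=0.5, center=960.0, sigma_oct=1.5):
    t = np.arange(int(fs * dur)) / fs
    x = np.zeros_like(t)
    f = f0
    while f < fs / 2:
        # Discrete components sample a fixed continuous spectral envelope;
        # as f0 moves, the sampled (discrete) envelope changes shape.
        w = np.exp(-0.5 * (np.log2(f / center) / sigma_oct) ** 2)
        x += w * np.sin(2 * np.pi * f * t)
        f *= 2                     # octave-spaced partials (constant chroma)
    return x

tones = [shepard_tone(f0) for f0 in (60.0, 75.0, 95.0)]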
Best,
Bruno
cc: Al Bregman
===================
Alexander Galembo, Ph. D.
NSERC-NATO Science fellow
Acoustics lab, Dept. of Psychology, Queen's University
Kingston ON K7L3N6
Canada
Tel. (613) 5456000, ext. 5754
Fax (613) 5452499
E-mail: galembo@pavlov.psyc.queensu.ca
URL : http://www.geocities.com/CapeCanaveral/Lab/8779/