Also, in the late 1930s Hans Wallach gave a geometrical argument for
the use of motion in localization: Say that you hear a sound and,
using interaural timing and level differences, can localize it to the
median plane. If you turn your head to the left, a sound source in front of
you will move towards your right ear, while one behind you will move
towards your left ear.
Furthermore, if the sound source is elevated above or below the
horizontal plane, it will move less than if it were in the plane. In
the extreme, when the sound source is directly above you its cues
don't change at all as you rotate your head.
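
To make the geometry concrete, here is a minimal sketch in Python. It
is my own illustration, not Wallach's formulation: the function name,
the angle conventions, and the use of the source direction's
component along the interaural axis as a stand-in for the interaural
cues are all assumptions.

```python
import math

def lateral_component(azimuth_deg, elevation_deg, head_yaw_deg):
    """Component of the source direction along the interaural axis.

    Assumed conventions (illustrative only): azimuth 0 is straight
    ahead of the un-turned head and increases towards the left,
    180 is directly behind; elevation 90 is directly overhead;
    head_yaw_deg > 0 means the head has turned to the left.
    The result is positive when the source lies towards the right
    ear and negative when it lies towards the left ear.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    yaw = math.radians(head_yaw_deg)
    # Dot product of the unit source vector with the right-ear axis
    # of the rotated head reduces to cos(el) * sin(yaw - az).
    return math.cos(el) * math.sin(yaw - az)

# Turn the head 10 degrees to the left and compare four sources.
for name, az, el in [("front", 0, 0), ("behind", 180, 0),
                     ("front, elevated 60 deg", 0, 60),
                     ("overhead", 0, 90)]:
    print(name, round(lateral_component(az, el, 10), 4))
```

With the head turned left, the front source gets a positive value
(towards the right ear), the rear source a negative one, the elevated
source a smaller shift, and the overhead source essentially none.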
See e.g. Wallach, H. "On Sound Localization." JASA 10(1), 1938, p83
-Mike
On Sun, Dec 18, 2005 at 01:10:40PM +0100, Christian Kaernbach wrote:
Dear Al,
This is an interesting question. I know of no work directly
addressing head movements and auditory scene analysis. The role of
head movements for sound source localization was certainly well
studied quite some time ago (see postscript). However, whether head
movements are only relevant to localization, or whether they help to
separate sound sources, would be an interesting field of research. I
would reckon that in a typical cocktail party situation the listener
would move the head until he/she found the optimal SNR between the
desired signal and the rest of the sound field. Once this position
is found, it should not be helpful to move the head further. Sure, it
would make the desired sound source a moving target, but with all the
other sound sources moving around in the same way. The first approach
should be to monitor what listeners actually do in difficult auditory
scenes. I could imagine that in the case of repetitions (the important
phrase comes twice, e.g. because the speaker realized that it did not
come through) the listener might be inclined (sic!) to try a
different head position, so as to reduce redundancy between the two
communications.
Best regards,
Christian
PS: Let me write on the role of head movements for sound source
localization (SSL) in a postscript, a) because I am not a real expert
on this issue, and b) because many list members might know plenty about
it. I cannot say where I picked up this knowledge, most probably
from oral communication early in my career. The two primary cues for
SSL are intensity differences and delay differences. These two cues
are, however, quite ambiguous: all sound sources on the famous "cone
of confusion" induce the same delay and intensity difference. Think
of zero delay and intensity difference: this is true for all sound
sources on the median plane, i.e. from ahead, top, behind, below,
etc. Nevertheless, it was noted early on that humans can discriminate
well between sound sources from ahead and from behind. I was
told the anecdote that in the early days of SSL research this
performance was attributed to (from today's viewpoint) weird supposed
mechanisms, such as a sound pressure sensitivity of the chest. Later
on, researchers started to use head fixation, and much of the ahead/behind
discrimination performance went away. Another mechanism involved in
this performance is the spectral filtering by the outer ear
(head-related transfer functions, HRTFs), but this mechanism can only be
helpful if the sound (or its supposed spectrum) is known to the
listener. So when sine tones of varying loudness are used, the
ahead/behind discrimination depends critically on the participant's
ability to move his/her head. I suppose that much of this can be
found in Jens Blauert's book "Spatial Hearing". ... Note that head
movements for the improvement of SSL are quite a nice example of the
role of action in perception, up to the point where some say
"Perception is a behavior, a specific kind of action aiming at the
driving home of a maximum amount of information on the object of
interest." (Is this Gibsonian?)
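
For the "cone of confusion" mentioned above, a minimal sketch in
Python may help. It is only an illustration under strong assumptions:
two point-like ears with no head between them (so it says nothing
about intensity differences), an assumed ear separation of 0.18 m, an
assumed speed of sound of 343 m/s, and hypothetical function and
parameter names.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value
EAR_SEPARATION = 0.18   # m, assumed value

def arrival_time_difference(lateral_deg, roll_deg, distance=2.0):
    """Left-minus-right arrival time in seconds for two point ears.

    lateral_deg: angle between the source direction and the median
        plane, positive towards the right ear.
    roll_deg: position around the interaural axis
        (0 ahead, 90 above, 180 behind, 270 below).
    """
    lat = math.radians(lateral_deg)
    roll = math.radians(roll_deg)
    # Coordinates: x forward, y towards the left ear, z up.
    source = (distance * math.cos(lat) * math.cos(roll),
              -distance * math.sin(lat),
              distance * math.cos(lat) * math.sin(roll))
    left_ear = (0.0, EAR_SEPARATION / 2, 0.0)
    right_ear = (0.0, -EAR_SEPARATION / 2, 0.0)
    path_difference = math.dist(source, left_ear) - math.dist(source, right_ear)
    return path_difference / SPEED_OF_SOUND

# Same lateral angle, four positions around the interaural axis:
# ahead, above, behind, below.
for roll in (0, 90, 180, 270):
    print(roll, arrival_time_difference(30, roll))
```

In this simplified model the four printed delays agree up to rounding
error, and for a lateral angle of zero (the median plane) the delay
is zero from ahead, above, behind and below alike.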
--
Christian Kaernbach
Institut für Psychologie
Karl-Franzens-Universität Graz
8010 Graz
Austria
www.kaernbach.de fechner.uni-graz.at