Re: Sources & terms (Erick Gallun)

Subject: Re: Sources & terms
From: Erick Gallun <gallun(at)BU.EDU>
Date: Tue, 4 May 2004 12:15:03 -0400

Dear list members,

The issue of auditory streaming is very interesting to me and is playing an increasingly large role in my research. However, the three examples below and Dr. Bregman's excellent points about methods demonstrate a fundamental problem with the concept. The best way to describe the problem is by analogy to the visual world, where the concepts of objects and features have (I believe) more agreed-upon meanings. In both vision and audition, the physical signals are the result of events or objects that the perceiver is trying to reconstruct. The "stream" or the "object" is a mental construct of the perceiver. I will use the examples given to extend my point:

>(1) An orchestral recording played through one loudspeaker. There is
>one "sound" source, but the signal is complex. This is possibly the
>most difficult to deal with.

This is a "stream" in the way that a photograph is a visual object. Based on the listener's attentional state and voluntary allocations of processing power, it can stay a single object or be broken down into components - some of which can become smaller objects. The poor resolution of the presentation mechanism increases the difficulty of interpreting the picture as multiple objects reflecting light in space or the recording as multiple objects creating sound across time.

>(2) An orchestral recording in stereo (X/Y mic technique). There are
>two sound sources and the "ear" (?) / "mind" (?) is assisted in
>decoding by (apparent) spatialization of (apparent) sources. Complex.

The analogy here is to a picture (or perhaps a movie) rendered in simulated 3D. The entire recording is an object being presented to the perceiver, but the increased resolution allows the perceiver to overcome informational and energetic masking (assuming stereo playback) to a greater degree.

>(3) An orchestral recording recorded onto a 60 channel medium and
>played back through 60 loudspeakers. At the 'individual level', the
>least complex in some ways.

This is like the amusement ride where a movie is back-projected onto the walls of a circular room. By separating the sources in space, informational and energetic masking are reduced still further. Nonetheless, the 'streams' coming out of the sixty speakers are still nothing more than source material for the listener to use to recreate an idea of what the actual physical events and objects were that created the sounds.

The original question was "How many concurrent streams can be perceived simultaneously?". I think the visual analogy is again useful. How many visual objects can be perceived? Obviously, the answer depends on how you test it and whether or not memory counts as part of simultaneous perception. For a sound (or light) input created by the combination of multiple sources, the ability of a perceiver to process it will depend heavily on the amount of time allocated and the task they must perform. An excellent paper on visual attention (Shiffrin et al., 1976) showed that observers could detect a brief visual target in any one of 49 spatial positions as well when they were not cued which position to watch as when they were told which one it might appear in. A letter identification task reduced the number of positions to 9. Is this what we mean by perceiving a stream or an object?

In general, I think that stream segregation is an excellent description of how humans make sense of sound input, but for answering scientific questions we must be very precise about exactly what we mean.

Erick Gallun
Postdoctoral Researcher
Hearing Research Center
Boston University

This message came from the mail archive
http://www.auditory.org/postings/2004/
maintained by:

Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University