Re: Computational ASA -- how many sources can humans perceive? (Valeriy Shafiro)


Subject: Re: Computational ASA -- how many sources can humans perceive?
From:    Valeriy Shafiro  <Valeriy_Shafiro(at)RUSH.EDU>
Date:    Fri, 30 Apr 2004 14:26:01 -0500

From: "Maher, Rob" rmaher(at)ECE.MONTANA.EDU
>It is sometimes argued that "humans can do separation, so the problem must
>be soluble." I would argue that humans do source _identification and
>tracking_ very effectively, but perhaps humans do not actually solve the
>computational _separation_ problem, in the sense that the individual vectors
>'B', 'C', etc. are extracted in a neural signal processing context.

I would like to ask a further question: Do we, in fact, know how many independent sound sources in a mixture humans can perceive? Thus far I know of only one research report where human listeners were asked to identify sound sources in a recorded "real-world" sound mixture (Ellis, D. P. (1996). Prediction-driven computational auditory scene analysis).

We have been talking about this issue with Brian Gygi, and from the few related reports that Brian found, it appears that humans may not be that good at perceiving independent sound sources simultaneously. For instance, Jennifer Tufts and Tom Frank (J. Acoust. Soc. Am. 101, 3107 (1997)) found that the accuracy of judging the number of talkers in a multitalker mixture drops considerably when there are more than 3 talkers. There is also a report by David Huron (Music Perception, Vol. 19, No. 1 (2001), pp. 1-64, or online at http://www.music-cog.ohio-state.edu/Huron/Publications/huron.voice.leading.html ) that accuracy in estimating the number of musical lines in polyphonic music worsens considerably beyond 3.

Some anecdotal evidence for this limit also comes from movie sound effects designers. Here is a quotation from Walter Murch, a renowned sound effects artist:

"There is a rule of thumb I use which is never to give the audience more than two-and-a-half things to think about aurally at any one moment. Now, those moments can shift very quickly, but if you take a five-second section of sound and feed the audience more than two-and-a-half conceptual lines at the same time, they can't really separate them out. There's just no way to do it, and everything becomes self-canceling." (cited from http://www.filmsound.org/murch/waltermurch.htm)

Any thoughts, comments, and references relevant to this issue are appreciated.

-------------------------------------------------------------
Valeriy Shafiro
Communication Disorders and Sciences
Rush University Medical Center
Chicago, IL

office (312) 942 - 3298
lab (312) 942 - 3316
email: valeriy_shafiro(at)rush.edu


This message came from the mail archive
http://www.auditory.org/postings/2004/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University