Dear John and List:

I appreciate your creative contribution to CASA and to the discussion of this list. Your method seems like an excellent way of achieving your goal, which you describe in your essay as, "Our design problem thus begins with finding a way to use interaural time difference as the primary mode for locating acoustic sources." If I understand it correctly, your method provides an accurate way of calculating differences in the point of spatial origin of the sounds, rejecting reflections, in order to reconstruct them individually.

It is an excellent beginning. The reason I use the word "beginning" is that for humans, and presumably other animals, the use of spatial position is only one of many ways of solving the scene analysis problem in audition. This becomes clear when you observe that sounds can be segregated even when they come around corners, or are heard on a monophonic radio or by a unilaterally deaf person. I suspect that to replicate the full range of human auditory scene analysis (ASA), the attempt to solve the problem computationally (CASA) will have to use the same range of environmental cues.

Apart from spatial origin, the following sorts of information are used by humans:

(A) For integrating components that arrive overlapped in time:
1. Harmonic relations
2. Asynchrony of onset and offset
3. Spectral separation
4. Independence of amplitude changes in different parts of the spectrum

(B) For integrating components over time:
5. Spectral separation
6. Separation in time (interacts with other factors)
7. Differences in spectral shape
8. Differences in intensity (a weak effect)
9. Abruptness/smoothness of transition from one sound to the next

(I have attached a 2-page summary of what is known about ASA in humans. As well as mentioning factors 1 to 9, it describes the effects of ASA on the experience of the listener. I have used it as a handout in talks I have given. It is in RTF format which should be readable by most versions of Word.)
I'm not sure whether your rejection of the Fourier method extends to all methods of decomposing the input into spectral components. However, if it does, we should bear in mind that factors 3, 4, 5, 7, and probably 1, listed above, are most naturally stated on a frequency x time representation -- that is, on a spectrogram or something like it. Furthermore, when you look at a spectrographic representation of an auditory signal, the visual grouping that occurs is often directly analogous to the auditory organization (provided that the time and frequency axes are properly scaled). Why would this be so if some sort of frequency axis were not central to auditory perception, playing a role analogous to a spatial dimension in vision? Perhaps the Fourier transform is not the best approach to forming this frequency dimension, but something that does a similar job is required. Finally, there is overwhelming physiological evidence that the human nervous system does a frequency analysis of the sound and retains separate frequency representations all the way to the brain.

I understand that your goal is not necessarily to separate signals the way people do, but the long-term goal of CASA should be to reproduce the full range of accomplishments of human ASA.

Perhaps I have missed some of the consequences of your method. If so, I would be happy to be corrected.

Best wishes,

Al

-------------------------------------------------
Albert S. Bregman, Emeritus Professor
Dept of Psychology, McGill University
1205 Docteur Penfield Avenue
Montreal, QC, Canada H3A 1B1
Office: Phone: +1 (514) 398-6103 Fax: +1 (514) 398-4896
Home: Phone & Fax: +1 (514) 484-2592
Email: bregman@hebb.psych.mcgill.ca
-------------------------------------------------

----- Original Message -----
From: John K. Bates <jkbates@COMPUTER.NET>
To: <AUDITORY@LISTS.MCGILL.CA>
Sent: 29-Jan-01 1:46 PM
Subject: CASA problems and solutions

> Dear List,
> You may have noticed that contributions on CASA have dropped to near
> zero after the enthusiasm of the early 1990s. A few people have suggested ...
> John Bates
> Time/Space Systems
> Pleasantville, New York
>
Attachment:
Bregman handout.rtf
Description: MS-Word document