Ph.D. thesis on computational audition available (Paris Smaragdis)

Subject: Ph.D. thesis on computational audition available
From: Paris Smaragdis <paris(at)MEDIA.MIT.EDU>
Date: Sat, 5 May 2001 20:33:46 -0400

Dear friends,

you might find the following thesis interesting.

Redundancy Reduction for Computational Audition, a Unifying Approach.
Paris Smaragdis, Massachusetts Institute of Technology, Media Laboratory, May 2001.

Abstract

Computational audition has always been a subject of multiple theories. Unfortunately, very few place audition in the grander scheme of perception, and even fewer facilitate formal and robust definitions as well as efficient implementations. In our work we set forth to address these issues. We present mathematical principles that unify the objectives of lower level listening functions, in an attempt to formulate a global and plausible theory of computational audition. Using tools to perform redundancy reduction, and adhering to theories of its incorporation in a perceptual framework, we pursue results that support our approach.

Our experiments focus on three major auditory functions: preprocessing, grouping and scene analysis. For auditory preprocessing, we prove that it is possible to evolve cochlear-like filters by adaptation to natural sounds. Following that, and using the same principles as in preprocessing, we present a treatment that collapses the heuristic set of the gestalt auditory grouping rules down to one efficient and formal rule. We successfully apply the same elements once again to form an auditory scene analysis foundation, capable of detection, autonomous feature extraction, and separation of sources in real-world complex scenes.

Our treatment was designed so as to be independent of parameter estimations and data representations specific to the auditory domain. Some of our experiments have been replicated in other domains of perception, providing equally satisfying results, and a potential for defining global ground rules for computational perception, even outside the realm of our five senses.

The documents and some of the media examples are to be found at:
http://sound.media.mit.edu/~paris/phd

Best regards,
Paris

This message came from the mail archive
http://www.auditory.org/postings/2001/
maintained by:

DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University