
Summary of responses to my previous enquiry.



Hello,

I recently posted an enquiry about computational auditory models
and received several responses. I am pleased to post those
responses here.

I would also like to thank again everyone who responded.

Ke Chen




---------------------------------------
Received: from pkuns.PKU.EDU.CN by pccms.pku.edu.cn with SMTP id AA05572
  (5.67b8/IDA-1.5 for chenke); Tue, 4 Apr 1995 02:00:52 +0800
Received: from alink-gw.apple.com by pkuns.PKU.EDU.CN with SMTP id AA14629
  (5.67b/IDA-1.5 for chenke@pccms.pku.edu.cn); Tue, 4 Apr 1995 02:04:16 +0800
Received: from federal-excess.apple.com by alink-gw.apple.com with SMTP
 (921113.SGI.UNSUPPORTED_PROTOTYPE/7-Oct-1993-eef)
        id AA15249; Mon, 3 Apr 95 11:04:41 -0700
        for chenke@pku.edu.cn
Received: from taurus.apple.com by federal-excess.apple.com (5.0/1-Nov-1994-eef)
        id AA23069; Mon, 3 Apr 1995 11:03:13 +0800
        for chenke@pku.edu.cn
Received: from [17.255.8.25] (dlyon1.atg.apple.com [17.255.8.25]) by
 taurus.apple.com (8.6.10/8.6.5) with SMTP id LAA20750; Mon, 3 Apr 1995 11:04:36
 -0700
Date: Mon, 3 Apr 1995 11:04:36 -0700
X-Sender: lyon@taurus.apple.com
Message-Id: <v02110106aba574cad445@[17.255.8.25]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
To: Chen Ke <chenke@pku.edu.cn>
From: lyon@apple.com (Richard Lyon)
Subject: Re: Enquiry about the work on computational auditory model.
Cc: AUDITORY@vm1.mcgill.ca
Content-Length: 1903
Status: RO

>To the best of my knowledge, almost all of the work in this field focuses on
>the peripheral auditory model. Little work on the central auditory system, the
>auditory pathway, and the auditory cortex has been reported. As a result, I want
>to investigate the aforementioned work and conduct my research. I would appreciate
>it if anyone could give me pointers.

Dr. Ke Chen,

It's true that the modeling work gets thinner, harder to find, and harder
to understand, assess, and apply, as you move more centrally into the
auditory system.  But there is a substantial body of work out there if you
look hard enough.

Of course, it depends on what you mean by central, too.  Are pitch and
binaural mechanisms central?  The auditory pathway has many levels, and
these are probably more peripheral than central, but are not as peripheral
as the cochlea and cochlear nucleus.  The correlation models of Licklider
(1951, for pitch) and Jeffress (1948, for binaural) have spawned a lot
of work in the last decade, my own included.  Neurophysiologists have
confirmed the existence of binaural cross-correlation circuits (e.g. T.C.
Yin in cats; Konishi, Knudsen, and Sullivan in barn owls) and of delay-tuned
correlators for pitch-like operations (N. Suga in bats).
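
As a rough illustration of the cross-correlation idea behind the Jeffress
model mentioned above, here is a minimal Python sketch. It is not taken from
any of the work cited in this thread. It estimates an interaural delay by
finding the lag that maximizes the cross-correlation of two ear signals. The
function name, the noise test signal, the 16 kHz sampling rate, and the
5-sample delay are all assumptions chosen for illustration; a real model
would more likely correlate the outputs of cochlear filter channels than raw
waveforms.

```python
import random


def cross_correlation_lag(left, right, max_lag):
    """Return the lag (in samples) maximizing sum(left[i] * right[i + lag]).

    A positive result means the right signal lags the left one.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(left)):
            j = i + lag
            if 0 <= j < len(right):
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical test input: broadband noise, with the right ear receiving
# the same signal 5 samples later (about 0.3 ms at an assumed 16 kHz rate).
rng = random.Random(0)
true_delay = 5
left = [rng.uniform(-1.0, 1.0) for _ in range(400)]
right = [0.0] * true_delay + left[:-true_delay]

estimated_delay = cross_correlation_lag(left, right, max_lag=10)
print("estimated interaural delay (samples):", estimated_delay)
```

Each candidate lag plays the role of one coincidence detector along a delay
line; the detector whose internal delay matches the interaural delay
responds most strongly.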

Knudsen's and Konishi's groups continue to do lots of studies and models
of primarily spatial processing through I.C. and tectum, including learning.
Cortical modeling is in a more primitive state, but some attempts are
being made (e.g. by Shamma) to understand and model the physiology.
There are numerous other groups active in auditory physiology and modeling,
and I apologize for not having time to give a more balanced account.

Let us know what you intend to do with modeling, and maybe we can make
more specific suggestions.  Do you have an application in mind, or a
particular level you want to model?

\Dick Lyon (408)974-4245
 Apple/ATG/InteractiveMedia/PerceptionSystems

-----------------------------------------
Received: from pkuns.PKU.EDU.CN by pccms.pku.edu.cn with SMTP id AA05441
  (5.67b8/IDA-1.5 for chenke); Mon, 3 Apr 1995 23:10:50 +0800
Received: from cornell.edu by pkuns.PKU.EDU.CN with SMTP id AA14454
  (5.67b/IDA-1.5 for chenke@pccms.pku.edu.cn); Mon, 3 Apr 1995 23:14:18 +0800
Received: from blue.ornith.cornell.edu (BLUE.ORNITH.CORNELL.EDU
 [132.236.164.11]) by cornell.edu (8.6.9/8.6.9) with SMTP id LAA12827 for
 <chenke@PKU.EDU.CN>; Mon, 3 Apr 1995 11:14:58 -0400
Received: from minke by blue.ornith.cornell.edu (4.1/SMI-4.1)
        id AA03247; Mon, 3 Apr 95 11:14:57 EDT
Date: Mon, 3 Apr 95 11:14:57 EDT
Message-Id: <9504031514.AA03247@blue.ornith.cornell.edu>
Received: by minke (4.1/SMI-4.1)
        id AA00428; Mon, 3 Apr 95 11:14:56 EDT
To: chenke@pku.edu.cn
In-Reply-To: <199504030420.AAA02619@cornell.edu> (message from Chen Ke on Mon, 3
 Apr 1995 12:05:32 +0800)
From: "Dave Mellinger" <dave@ornith.cornell.edu>
Sender: dave%blue@cornell.edu
Subject: Re: Enquiry about the work on computational auditory model.
Reply-To: dave@ornith.cornell.edu
Status: RO

Below are references for three Ph.D. theses (one of them mine) you
might want to look at.  They all have computational models which
include higher parts of the auditory system.  There is an active group
at the University of Sheffield in Britain working in this area (Cooke
and Brown, below, were associated with it).

Also, I'd suggest contacting DeLiang Wang at Ohio State University
(+1-614-292-2911), as he is currently working on neurocomputational
models of auditory processing.  Try also Dan Ellis at the MIT Media
Lab (dpwe@media.mit.edu), who's doing some interesting work.

=======================================================================
David K. Mellinger, Postdoctoral Research Associate
Bioacoustics Research Program            email  dave@ornith.cornell.edu
Cornell Laboratory of Ornithology        phone  +1-607-254-2431
159 Sapsucker Woods Road                 fax    +1-607-254-2415
Ithaca, NY  14850-1999  USA
=======================================================================

@phdthesis{cooke:thesis,
  author        = "Martin Peter Cooke",
  title         = "Modelling Auditory Processing and Organisation",
  school        = "University of Sheffield",
  year          = 1991,
  month         = may,
}

@phdthesis{brown:thesis,
  author        = "Guy Jason Brown",
  title         = "Computational Auditory Scene Analysis",
  school        = "University of Sheffield",
  year          = 1992,
  note          = "published as Department of Computer Science Rept. CS-92-22",
}

@phdthesis{mellinger:thesis,
  author        = "David K. Mellinger",
  title         = "Event Formation and Separation in Musical Sound",
  school        = "Stanford University",
  year          = 1991,
  address       = "Stanford, CA  94305",
}
-------------------------
Received: from pkuns.PKU.EDU.CN by pccms.pku.edu.cn with SMTP id AA05258
  (5.67b8/IDA-1.5 for chenke); Mon, 3 Apr 1995 21:17:26 +0800
Received: from sun0.aic.nrl.navy.mil by pkuns.PKU.EDU.CN with SMTP id AA14351
  (5.67b/IDA-1.5 for chenke@pccms.pku.edu.cn); Mon, 3 Apr 1995 21:20:51 +0800
Received: from sun35.aic.nrl.navy.mil by Sun0.AIC.NRL.Navy.Mil (4.1/SMI-4.0)
        id AA29723; Mon, 3 Apr 95 09:21:22 EDT
Received: by sun35.aic.nrl.navy.mil; Mon, 3 Apr 95 09:21:22 EDT
Date: Mon, 3 Apr 95 09:21:22 EDT
From: ballas@AIC.NRL.Navy.Mil
Message-Id: <9504031321.AA09189@sun35.aic.nrl.navy.mil>
To: chenke@pku.edu.cn
Subject: Re: Enquiry about the work on computational auditory model.
Status: RO

There was a series of papers published on computational
approaches to sound interpretation in the following book:
Natural Computation, edited by Whitman Richards, Cambridge,
MA: The MIT Press, 1988.
I have published a series of papers on how well people do
in identifying brief everyday sounds, and examined a series
of factors.  This work might provide guidance on what would
be important in a computational model.
jim
--------------------
Received: from pkuns.PKU.EDU.CN by pccms.pku.edu.cn with SMTP id AA05284
  (5.67b8/IDA-1.5 for chenke); Mon, 3 Apr 1995 21:19:53 +0800
Received: from vulcan.le.ac.uk by pkuns.PKU.EDU.CN with SMTP id AA14359
  (5.67b/IDA-1.5 for chenke@pccms.pku.edu.cn); Mon, 3 Apr 1995 21:23:03 +0800
Received: from violet.le.ac.uk by vulcan with SMTP (PP);
          Mon, 3 Apr 1995 14:21:23 +0100
Received: from VIOLET/MAILQUEUE by violet.le.ac.uk (Mercury 1.13);
          Mon, 3 Apr 95 14:21:07 +0100 (BST)
Received: from MAILQUEUE by VIOLET (Mercury 1.13);
          Mon, 3 Apr 95 14:10:24 +0100 (BST)
From: "Kien Seng, Wong" <ksw2@leicester.ac.uk>
To: chenke@pku.edu.cn
Date:          Mon, 3 Apr 1995 14:10:15 +0100 (BST)
Subject:       Re: Enquiry about the work on computational auditory model.
Priority: normal
X-Mailer: Pegasus Mail v3.22
Message-Id: <1347C2B24F8@violet.le.ac.uk>
Status: RO

Hello there,
    I read your message on the news server. I am currently trying to
model the auditory nervous system (ANS) also, starting with the VCN
cells. I have looked at a few types of cells in the VCN and also SOC
in the past few months.
    It all depends on which area of sound perception you want to
model. You should know that the SOC is recognised by many as
the beginning stage of sound localisation processing. However, I
must caution you that the SOC of humans is rather different from
that of other mammals. The inferior colliculus has still not been
investigated very much, so there seems to be little data in that
area at the moment.
    My current interests are the onset-c units in the VCN. Many have
suspected that they are used as pitch processors.... A guess anyway...
I recommend you read papers by E.D. Young, Sachs, Alan Palmer, and Ray
Meddis for some recent findings on the ANS.
    I have been trying to get some nice papers recently but have found
nothing interesting. I will try to inform you if I come across anything.
Please let me know also if you have anything interesting.

Thanks
Kien Seng Wong
BTSP: Speech and Hearing Section
Engineering Department
University of Leicester

E-Mail : KSW2@LE.AC.UK
-----------------
Received: from pkuns.PKU.EDU.CN by pccms.pku.edu.cn with SMTP id AA09247
  (5.67b8/IDA-1.5 for chenke); Wed, 5 Apr 1995 00:59:11 +0800
Received: from alink-gw.apple.com by pkuns.PKU.EDU.CN with SMTP id AA15771
  (5.67b/IDA-1.5 for chenke@pccms.pku.edu.cn); Wed, 5 Apr 1995 01:02:35 +0800
Received: from federal-excess.apple.com by alink-gw.apple.com with SMTP
 (921113.SGI.UNSUPPORTED_PROTOTYPE/7-Oct-1993-eef)
        id AA18561; Tue, 4 Apr 95 10:02:53 -0700
        for chenke@pku.edu.cn
Received: from taurus.apple.com by federal-excess.apple.com (5.0/1-Nov-1994-eef)
        id AA02258; Tue, 4 Apr 1995 10:01:26 +0800
        for chenke@pku.edu.cn
Received: from [17.255.8.25] (dlyon1.atg.apple.com [17.255.8.25]) by
 taurus.apple.com (8.6.10/8.6.5) with SMTP id KAA10505 for <chenke@pku.edu.cn>;
 Tue, 4 Apr 1995 10:02:51 -0700
Date: Tue, 4 Apr 1995 10:02:51 -0700
X-Sender: lyon@taurus.apple.com
Message-Id: <v0211010aaba6c4efb528@[17.255.8.25]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
To: Chen Ke <chenke@pku.edu.cn>
From: lyon@apple.com (Richard Lyon)
Subject: Re: Enquiry about the work on computational auditory model.
Content-Length: 4623
Status: RO

> In particular, we're going to apply this computational
>model to speaker recognition.

Ke,

In that case maybe you don't care about binaural and spatial aspects
of the auditory system, which is a lot of what the auditory system is
about, and where a lot of the best work has been on modeling it.

Pitch, on the other hand, is almost certainly an important cue for
speaker recognition, and this is an area where I have done some work.
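
To make the pitch cue concrete, here is a minimal Python sketch of an
autocorrelation pitch estimate, loosely in the spirit of the Licklider
correlation model mentioned earlier in this thread. It is a simplified
illustration and not the method of any paper listed here: it works on the
raw waveform rather than on cochlear channel outputs, and the function name,
the search range of 60-400 Hz, the 8 kHz sampling rate, and the test signal
are all assumptions.

```python
import math


def autocorrelation_pitch(signal, fs, fmin=60.0, fmax=400.0):
    """Estimate pitch in Hz as fs divided by the lag that maximizes
    the (unnormalized) autocorrelation within the given pitch range."""
    lag_min = int(fs / fmax)
    lag_max = int(fs / fmin)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        score = sum(signal[i] * signal[i + lag]
                    for i in range(len(signal) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return fs / best_lag


# Hypothetical test input: harmonics 2, 3 and 4 of 125 Hz, with no energy
# at 125 Hz itself (a "missing fundamental" complex), sampled at 8 kHz.
fs = 8000
f0 = 125.0
signal = [sum(math.sin(2.0 * math.pi * h * f0 * n / fs) for h in (2, 3, 4))
          for n in range(1024)]

estimated_f0 = autocorrelation_pitch(signal, fs)
print("estimated pitch (Hz):", estimated_f0)
```

The waveform repeats every 64 samples even though it contains no component
at 125 Hz, so the autocorrelation peak falls at the period of the missing
fundamental. That behaviour is one reason correlation-based models are
attractive for pitch.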

I don't have much insight to offer on how auditory models are likely
to apply to speaker recognition more generally.

\Dick Lyon (408)974-4245
 Apple/ATG/InteractiveMedia/PerceptionSystems


Here is a selection of my relevant publications.  Let me know if
you want copies of any that are hard to find locally.

Malcolm Slaney, Daniel Naar, and Richard F. Lyon, "Auditory Model
Inversion for Sound Separation," Proceedings IEEE International
Conference on Acoustics, Speech, and Signal Processing, Adelaide,
April 1994.

Richard F. Lyon, "Cost, Power, and Parallelism in Speech Signal
Processing," Proc. IEEE 1993 Custom Integrated Circuits Conference,
pp. 15.1.1-15.1.9, San Diego, CA, May 9-12, 1993.

Malcolm Slaney and Richard F. Lyon, "On the Importance of Time--
A Temporal Representation of Sound," chapter 5 in Visual Representations
of Speech Signals, M. Cooke and Steve Beet (eds.), John Wiley & Sons
Ltd., 1992.

Lloyd Watts, Doug Kerns, Richard Lyon, and Carver Mead, "Improved
Implementation of the Silicon Cochlea," IEEE J. Solid State Circuits
27(5) pp.692-700, May 1992.

Clive Summerfield and Richard Lyon, "ASIC Implementation of the Lyon
Cochlea Model," Proceedings IEEE International Conference on
Acoustics, Speech, and Signal Processing, San Francisco, March 1992.

Malcolm Slaney and Richard F. Lyon, "Visualizing Sound with Auditory
Correlograms," DRAFT submitted to JASA 1991; unfinished.

Malcolm Slaney and Richard F. Lyon, "Apple Hearing Demo Reel," Apple
Technical Report #25, Apple Computer, Inc., Cupertino, 1991.

Richard F. Lyon, "Automatic Gain Control in Cochlear Mechanics",
The Mechanics and Biophysics of Hearing, P. Dallos et al., eds.,
Springer-Verlag, 1990.

Malcolm Slaney and Richard Lyon, "A Perceptual Pitch Detector,"
Proceedings IEEE International Conference on Acoustics, Speech, and
Signal Processing, Albuquerque, April 1990.

Richard O. Duda, Richard F. Lyon, and Malcolm Slaney, "Correlograms and
the Separation of Sound," 24th Asilomar Conference on Signals, Systems
and Computers, IEEE, Maple Press, 1990.

Richard F. Lyon and Carver Mead, "Cochlear Hydrodynamics Demystified",
Caltech Computer Science Technical Report Caltech-CS-TR-88-4, 1989.

Richard F. Lyon and Carver Mead, "Electronic Cochlea", Ch. 16 in
Analog VLSI and Neural Systems, Carver Mead, Addison Wesley, 1989.

Richard F. Lyon and Carver A. Mead, "An Analog Electronic Cochlea"
IEEE Trans. ASSP. 36(7), July 1988.

Richard F. Lyon and Carver A. Mead, "A CMOS VLSI Cochlea," Proceedings
IEEE International Conference on Acoustics, Speech, and Signal
Processing, New York, April 1988.

Richard F. Lyon and Eric P. Loeb, "Experiments in Isolated Digit
Recognition with a Cochlear Model--An Update", Proceedings Speech
Recognition Workshop, DARPA, San Diego, March 1987.

Richard F. Lyon, "Speech Recognition in Scale Space", Proceedings IEEE
International Conference on Acoustics, Speech, and Signal Processing,
Dallas, 1987.

Richard F. Lyon, "Speech Recognition Experiments with a Cochlear Model",
Proceedings, DARPA Speech Recognition Workshop, Palo Alto, Feb. 1986,
and shorter version in Proceedings of Montreal Symposium on Speech
Recognition, McGill Univ., July, 1986.

Richard F. Lyon and Lounette Dyer, "Experiments with a Computational
Model of the Cochlea", Proceedings IEEE International Conference on
Acoustics, Speech, and Signal Processing, Tokyo, 1986.

Richard F. Lyon and Niels Lauritzen, "Processing Speech with the
Multi-Serial Signal Processor", Proceedings IEEE International
Conference on Acoustics, Speech, and Signal Processing, Tampa, March,
1985.

Richard F. Lyon, "Computational Models of Neural Auditory Processing",
Proceedings IEEE International Conference on Acoustics, Speech, and
Signal Processing, San Diego, March, 1984.

Richard F. Lyon, "A Computational Model of Binaural Localization and
Separation", Proceedings IEEE International Conference on Acoustics,
Speech, and Signal Processing, Boston, April 1983.

Richard F. Lyon, "A Computational Model of Filtering, Detection, and
Compression in the Cochlea", Proceedings IEEE International Conference
on Acoustics, Speech, and Signal Processing, Paris, May 1982.

-------------------------
Received: from pkuns.PKU.EDU.CN by pccms.pku.edu.cn with SMTP id AA04713
  (5.67b8/IDA-1.5 for chenke); Fri, 7 Apr 1995 22:37:41 +0800
Received: from media.mit.edu (media-lab.media.mit.edu) by pkuns.PKU.EDU.CN with
 SMTP id AA19325
  (5.67b/IDA-1.5 for chenke@pccms.pku.edu.cn); Fri, 7 Apr 1995 22:41:27 +0800
Received: by media.mit.edu (5.57/DA1.0.4.amt)
        id AA06918; Fri, 7 Apr 95 10:41:15 -0400
Message-Id: <9504071441.AA06918@media.mit.edu>
To: Chen Ke <chenke@pku.edu.cn>
Subject: Re: Enquiry about your work in auditory model.
In-Reply-To: Your message of "Fri, 07 Apr 1995 17:01:10 +0800."
             <199504070901.AA04051@pccms.pku.edu.cn>
X-Phys-Location: 1039 Mass Ave #8A, Cambridge MA 02138
Date: Fri, 07 Apr 1995 10:41:15 -0400
From: "Dan Ellis" <dpwe@media.mit.edu>
Status: RO

Dear Dr. Chen -

Thank you for your message.  I was interested in your original post
to AUDITORY and in Dick Lyon's subsequent response.  My own interests
lie in functional modeling of the higher auditory system -- a field
which is now gaining some identity under the title of "Computational
Auditory Scene Analysis", broadly, computer models attempting to
reproduce the kinds of phenomena described in the book "Auditory
Scene Analysis" by psychologist Albert Bregman.  Although ultimately
the study of the neurophysiology of the auditory centers in the brain
will inform and (hopefully) confirm this work, my feeling is that
it is difficult to interpret such research at the moment, and we are
better able to make progress simply trying to reproduce the phenomena
by any method we can get to work.  Also, I am an engineer, not a
physiologist, so I suppose my bias is showing.

I am focusing on the problem of auditory event detection (building
a computer model able to predict when a listener will report that
a new 'event' has occurred in a sound signal) and source separation
(systems that can partition acoustic energy into different groups
corresponding to percepts of independent sound sources);  my tools
are signal processing and the techniques of artificial intelligence,
and my inspiration comes from psychoacoustics and auditory
neurophysiology.  You can read about my previous work in the following
short papers, available over the internet:

Ellis, D.P.W., Vercoe, B.L. (1992).  "A perceptual representation of
 audio for sound source separation."
 Presented to the 123rd meeting of the Acoustical Society of America,
 Salt Lake City.
 ftp://sound.media.mit.edu/pub/Papers/asa-slc-92.ps.Z

Ellis, D.P.W. (1993).  "Hierarchic models of sound for separation and
 restoration."
 Proc. 1993 IEEE Mohonk workshop on Applications of Signal Processing to
 Acoustics and Audio.
 ftp://sound.media.mit.edu/pub/Papers/waspaa93.ps.Z

Ellis, D.P.W. (1994).  "A computer implementation of psychoacoustic
 grouping rules."
 Proc. 12th Intl. Conf. on Pattern Recognition, Jerusalem.
 ftp://sound.media.mit.edu/pub/Papers/ICPR-94.ps.Z

You can find out more about the research in our group through our web
server, http://sound.media.mit.edu/, although it's not really all that
informative at the moment.  If you have trouble downloading the papers,
I can mail you paper copies.

I should be glad to keep in touch over the areas that interest you.

I was in Beijing (and various other places in China) in 1987, as a
tourist.  It was fascinating but quite austere.  How are things there now?

Best wishes,

--  Dan Ellis <dpwe@media.mit.edu>
    MIT Media Lab Perceptual Computing - Machine Listening group.

----------------------------
Received: from pkuns.PKU.EDU.CN by pccms.pku.edu.cn with SMTP id AA04776
  (5.67b8/IDA-1.5 for chenke); Fri, 7 Apr 1995 23:36:09 +0800
Received: from research.att.com by pkuns.PKU.EDU.CN with SMTP id AA19376
  (5.67b/IDA-1.5 for chenke@pccms.pku.edu.cn); Fri, 7 Apr 1995 23:40:35 +0800
Received: by research.att.com; Fri Apr  7 11:37 EDT 1995
Received: (jba@localhost) by sear.research.att.com (940816.SGI.8.6.9/8.6.4) id
 PAA24205 for <chenke@pku.edu.cn>; Fri, 7 Apr 1995 15:36:30 GMT
Date: Fri, 7 Apr 1995 15:36:30 GMT
From: Jont Allen <jba@research.att.com>
Message-Id: <199504071536.PAA24205@sear.research.att.com>
To: Chen Ke <chenke@pku.edu.cn>
Subject: Re: Enquiry about the work on computational auditory model.
Status: RO

Ke,
I don't work that much on neural models, more on cochlear models.
One of the best pieces of work is that of Shamma, J. Neurophysiol.
Also there are some papers in IEEE acoustics and speech, by
Kuansan Wang (tiwain) and Shamma. Send Kuansan mail at
kuansan@research.att.com. He is one of my coworkers here at Bell Labs.
------------------
Received: from pkuns.PKU.EDU.CN by pccms.pku.edu.cn with SMTP id AA10722
  (5.67b8/IDA-1.5 for chenke); Mon, 10 Apr 1995 23:02:07 +0800
Received: from research.att.com by pkuns.PKU.EDU.CN with SMTP id AA22000
  (5.67b/IDA-1.5 for chenke@pccms.pku.edu.cn); Mon, 10 Apr 1995 23:03:43 +0800
Received: by research.att.com; Mon Apr 10 10:39 EDT 1995
Received: (kuansan@localhost) by dayton.research.att.com
 (940816.SGI.8.6.9/8.6.4) id OAA26466 for <chenke@pku.edu.cn>; Mon, 10 Apr 1995
 14:39:04 GMT
Date: Mon, 10 Apr 1995 14:39:04 GMT
From: Kuansan  Wang <kuansan@research.att.com>
Message-Id: <199504101439.OAA26466@dayton.research.att.com>
To: Chen Ke <chenke@pku.edu.cn>
Subject: Re: Jont Allen
Status: RO

My PhD work was to establish an integrated computational model for
auditory processing, ranging from the peripheral system to the primary
cortex (A1).  I have a mathematical model based on physiological data
on ferrets, and, from the data of psychoacoustic experiments, it
seems to work for humans as well.  More experiments are being designed
and conducted by my thesis advisor and his colleagues, and they are
recruiting more students with an engineering background (like myself) to
formulate observations and construct models.  The journal paper
documenting the first stage of this work is going to appear in IEEE
Trans on Speech and Audio this September.  However, a shorter version
has appeared in IEEE EMB magazine this March.

Since I joined Bell Labs, I have been applying my PhD work to speech
recognition.  Sounds like we have overlapping interests.  I'll be happy
to engage in more discussion and brainstorming on this matter.  Please
don't hesitate to write me.

p.s. Are you attending any conference in the US in the future by any chance?
This year's ASA meeting in Washington DC has a special session on the central
auditory system.  It is held at the end of May.
--------------------------------------------------------------------------