Re: Segregation with neural nets (Alain de Cheveigne)


Subject: Re: Segregation with neural nets
From:    Alain de Cheveigne  <alain(at)LINGUIST.JUSSIEU.FR>
Date:    Thu, 22 Apr 1993 12:37:04 +0100

>Hi,
>I'm searching for literature on algorithms of sound segregation.
>Has anybody tried using neural nets?
>
>With best regards, Bernhard.

I have a paper coming out in the June issue of JASA. It mentions neural mechanisms, but nothing that resembles (formal) neural nets. Below is the bibliography, containing several major references on sound separation. See also Guy Brown's recent thesis.

Alain.

---

Assmann, P. F. and Summerfield, Q. (1988). "Pitch-pulse asynchrony and the perceptual segregation of competing voices," Speech 88 conference (7th FASE), Edinburgh, 531-538.
Assmann, P. F. and Summerfield, Q. (1989). "Modeling the perception of concurrent vowels: Vowels with the same fundamental frequency," J. Ac. Soc. Am. 85, 327-338.
Assmann, P. F. and Summerfield, Q. (1990). "Modeling the perception of concurrent vowels: vowels with different fundamental frequencies," J. Ac. Soc. Am. 88, 680-697.
Bregman, A. S. (1990). Auditory scene analysis (MIT Press, Cambridge, Mass.), 773 p.
Brokx, J. P. L. and Nooteboom, S. G. (1982). "Intonation and the perceptual separation of simultaneous voices," Journal of Phonetics 10, 23-36.
Carney, L. H. and Yin, T. C. T. (1988). "Temporal coding of resonances by low-frequency auditory nerve fibers: single fiber responses and a population model," J. Neurophysiol. 60, 1653-1677.
Carney, L. H. and Yin, T. C. T. (1989). "Responses of low-frequency cells in the inferior colliculus to interaural time differences of clicks: excitatory and inhibitory components," J. Neurophysiol. 62, 144-161.
Carr, C. E. and Konishi, M. (1990). "A circuit for detection of interaural time differences in the brain stem of the barn owl," J. Neuroscience 10, 3227-3246.
Chan, J. C. K., Yin, T. C. T. and Musicant, A. D. (1987). "Effects of interaural time delays of noise stimuli on low-frequency cells in the cat's inferior colliculus. II. Responses to band-pass filtered noises," J. Neurophysiol. 58, 543-561.
Cherry, E. C. (1953). "Some experiments on the recognition of speech with one, and with two ears," J. Ac. Soc. Am. 25, 975-979.
de Cheveigne, A. (1986). "A pitch perception model," Proc. IEEE ICASSP, 897-900.
de Cheveigne, A. (1990). "F0 estimation from mixed speech," ATR Auditory and visual perception research labs technical report TR-A-0097.
de Cheveigne, A. (1991). "A mixed speech F0 estimation algorithm," Proc. ESCA (Eurospeech), Genova, 445-448.
Childers, D. G. and Lee, C. K. (1987). "Co-channel speech separation," Proc. IEEE ICASSP, 181-184.
Colburn, H. S. and Durlach, N. I. (1978). "Models of binaural interaction," in Handbook of perception, edited by E. C. Carterette and M. P. Friedman (Academic Press, New York), 467-518.
Darwin, C. J. and Culling, J. F. (1990). "Speech perception seen through the ear," Speech Communication 9, 469-475.
Delgutte, B. (1984). "Speech coding in the auditory nerve: II. Processing schemes for vowel-like sounds," J. Ac. Soc. Am. 75, 879-886.
Duifhuis, H., Willems, L. F. and Sluyter, R. J. (1982). "Measurement of pitch in speech: an implementation of Goldstein's theory of pitch perception," J. Ac. Soc. Am. 1568-1580.
Durlach, N. I. (1963). "Equalization and cancellation theory of binaural masking-level differences," J. Ac. Soc. Am. 35, 1206-1218.
Evans, E. F. (1983). "Pitch and cochlear nerve fibre temporal discharge patterns," in Hearing-Physiological bases and psychophysics, edited by R. Klinke and R. Hartmann (Springer-Verlag, Berlin), 140-146.
Fletcher, H. (1929). Speech and Hearing (Van Nostrand, New York).
Frazier, R. H., Samsam, S., Braida, L. D. and Oppenheim, A. V. (1976). "Enhancement of speech by adaptive filtering," Proc. IEEE ICASSP, 251-253.
Goldstein, J. L. and Srulovicz, P. (1977). "Auditory-nerve spike intervals as an adequate basis for aural frequency measurement," in Psychophysics and physiology of hearing, edited by E. F. Evans and J. P. Wilson (Academic Press, London), 337-347.
Hanson, B. A. and Wong, D. Y. (1984). "The harmonic magnitude suppression (HMS) technique for intelligibility enhancement in the presence of interfering noise," IEEE ICASSP, 2, pp. 18A.5.1-4.
Hess, W. (1983). Pitch determination of speech signals (Springer-Verlag, Berlin), 698 p.
Holdsworth, J., Nimmo-Smith, I., Patterson, R. D., and Rice, P. (1988). "Implementing a GammaTone filter bank," MRC Applied Psychology Unit technical report.
Horst, J. W., Javel, E. and Farley, G. R. (1986). "Coding of spectral fine structure in the auditory nerve. I. Fourier analysis of period and interspike interval histograms," J. Ac. Soc. Am. 79, 398-416.
Javel, E., Mott, J. B., Rush, N. L. and Smith, D. W. (1988). "Frequency discrimination: evaluation of rate and temporal codes," in Basic issues in hearing, edited by H. Duifhuis, J. W. Horst and H. P. Wit (Academic Press, London), 224-234.
Jeffress, L. A. (1948). "A place theory of sound localization," J. Comp. Physiol. Psychol. 41, 35-39.
Konishi, M., Takahashi, T. T., Wagner, H., Sullivan, W. E. and Carr, C. E. (1988). "Neurophysiological and anatomical substrates of sound localization in the owl," in Auditory function - neurobiological bases of hearing, edited by G. M. Edelman, W. E. Gall and W. M. Cowan (Wiley, New York), 721-745.
Kopec, G. E. and Bush, M. A. (1989). "An LPC-based spectral similarity measure for speech recognition in the presence of co-channel speech interference," Proc. IEEE ICASSP, 270-273.
Kuwabara, H., Sagisaka, Y., Takeda, K. and Abe, M. (1989). "Construction of ATR Japanese speech database as a research tool," ATR Interpreting telephony research laboratories technical report TR-I-0086.
Kuwada, S., Yin, T. C. T., Haberly, L. B. and Wickesberg, R. E. (1980). "Binaural interaction in the cat inferior colliculus: physiology and anatomy," in Psychophysical, physiological and behavioral studies in hearing, edited by G. v. d. Brink and F. A. Bilsen (Delft University Press), 401-411.
Langner, G. (1981). "Neuronal mechanisms for pitch analysis in the time domain," Exp. Brain Res. 44, 450-454.
Langner, G. and Schreiner, C. E. (1988). "Periodicity coding in the inferior colliculus of the cat. I. Neuronal mechanisms," J. Neurophysiol. 60, 1799-1822.
Lea, A. (1992). "Auditory models of vowel perception," unpublished doctoral dissertation, University of Nottingham.
Lea, A. P. and Summerfield, Q. (1992). "Monaural segregation of competing voices," Proc. ASJ committee on Hearing H-92-31, pp. 1-7.
Licklider, J. C. R. (1956). "Auditory frequency analysis," in Information theory, edited by C. Cherry (Butterworth, London), 253-268.
Licklider, J. C. R. (1959). "Three auditory theories," in Psychology, a study of a science, edited by S. Koch (McGraw-Hill), 41-144.
Licklider, J. C. R. (1962). "Periodicity pitch and related auditory process models," International Audiology 1, 11-36.
Lyon, R. (1984). "Computational models of neural auditory processing," Proc. IEEE ICASSP, 36.1.(1-4).
Lyon, R. F. (1983-1989). "A computational model of binaural localization and separation," Proc. IEEE ICASSP, reproduced in Natural computation, edited by W. Richards (MIT Press, Cambridge, Mass), 319-327.
McAdams, S. (1989). "Segregation of concurrent sounds. I: Effects of frequency modulation coherence," J. Ac. Soc. Am. 86, 2148-2159.
McFadden, D. (1973). "Precedence effects and auditory cells with long characteristic delays," J. Ac. Soc. Am. 54, 528-530.
Meddis, R. and Hewitt, M. (1988). "A computational model of low pitch judgement," in Basic issues in hearing, edited by H. Duifhuis, J. W. Horst and H. P. Wit (Academic Press, London), 148-153.
Meddis, R. and Hewitt, M. J. (1991a). "Virtual pitch and phase sensitivity of a computer model of the auditory periphery. I: pitch identification," J. Ac. Soc. Am. 89, 2866-2882.
Meddis, R. and Hewitt, M. J. (1991b). "Virtual pitch and phase sensitivity of a computer model of the auditory periphery. II: phase sensitivity," J. Ac. Soc. Am. 89, 2883-2894.
Meddis, R. and Hewitt, M. J. (1992). "Modelling the identification of concurrent vowels with different fundamental frequencies," J. Ac. Soc. Am. 91, 233-245.
Min, K., Chien, D., Li, S. and Jones, C. (1988). "Automated two speaker separation system," Proc. IEEE ICASSP, 537-540.
Moore, B. C. J. (1982). An introduction to the psychology of hearing (Academic Press, London).
Moore, B. C. J. and Glasberg, B. R. (1983). "Suggested formulae for calculating auditory-filter bandwidths and excitation patterns," J. Ac. Soc. Am. 74, 750-753.
Møller, A. R. (1977a). "Frequency selectivity of single auditory-nerve fibers in response to broadband noise stimuli," J. Ac. Soc. Am. 62, 135-142.
Møller, A. R. (1977b). "Frequency selectivity of the basilar membrane revealed from discharges in auditory nerve fibers," in Psychophysics and physiology of hearing, edited by E. F. Evans and J. P. Wilson (Academic Press, London), 197-207.
Nagabuchi, H., Kobayashi, T. and Yamamoto, H. (1979). "Speech enhancement and suppression in mixed speech," Transactions of the IECE (Japan) 62, 627-634 (in Japanese).
Naylor, J. A. and Boll, S. F. (1987). "Techniques for suppression of an interfering talker in co-channel speech," Proc. ICASSP, 205-208.
Palmer, A. R. (1988). "The representation of concurrent vowels in the temporal discharge patterns of auditory nerve fibers," in Basic issues in hearing, edited by H. Duifhuis, J. W. Horst and H. P. Wit (Academic Press, London), 244-251.
Palmer, A. R. (1990). "The representation of the spectra and fundamental frequencies of steady-state single- and double-vowel sounds in the temporal discharge patterns of guinea pig cochlear-nerve fibers," J. Ac. Soc. Am. 88, 1412-1426.
Palmer, A. R., Rees, A. and Caird, D. (1990). "Interaural delay sensitivity to tones and broad band signals in the guinea-pig inferior colliculus," Hearing Research 50, 71-86.
Palmer, A. R. (1992). "Segregation of the responses to paired vowels in the auditory nerve of the guinea-pig using autocorrelation," in Audition speech and language, edited by B. Schouten (Mouton-DeGruyter, Berlin), (in press).
Parsons, T. W. (1976). "Separation of speech from interfering speech by means of harmonic selection," J. Ac. Soc. Am. 60, 911-918.
Patterson, R. D., Robinson, K., Holdsworth, J., McKeown, D., Zhang, C., and Allerhand, M. (1992). "Complex sounds and auditory images," in Auditory physiology and perception, edited by Y. Cazals, L. Demany and K. Horner (Pergamon, Oxford), 429-446.
de Ribaupierre, F., Rouiller, E., Toros, A., and de Ribaupierre, Y. (1980). "Transmission delay of phase-locked cells in the medial geniculate body," Hearing Research 3, 65-77.
Ross, M. J., Shaffer, H. L., Cohen, A., Freudberg, R. and Manley, H. J. (1974). "Average magnitude difference function pitch extractor," IEEE Trans. ASSP 22, 353-362.
Ruggero, M. A. (1973). "Response to noise of auditory nerve fibers in the squirrel monkey," J. Neurophysiol. 36, 569-587.
Sachs, M. B. and Young, E. D. (1979). "Encoding of steady-state vowels in the auditory nerve: representation in terms of discharge rate," J. Ac. Soc. Am. 66, 470-479.
Scheffers, M. T. M. (1983). "Sifting vowels," thesis, University of Groningen.
Schreiner, C. E. and Langner, G. (1988a). "Coding of temporal patterns in the central auditory nervous system," in Auditory function - Neurobiological bases of hearing, edited by G. M. Edelman, W. E. Gall and W. M. Cowan (Wiley, New York), 337-361.
Schreiner, C. E. and Langner, G. (1988b). "Periodicity coding in the inferior colliculus of the cat. II. Topographical organization," J. Neurophysiol. 60, 1823-1840.
Schroeder, M. R. (1968). "Period histogram and product spectrum: new methods for fundamental-frequency measurement," J. Ac. Soc. Am. 43, 829-834.
Schubert, E. (1978). "History of research on hearing," in Handbook of perception, edited by E. C. Carterette and M. P. Friedman (Academic Press, New York), 41-80.
Silva, F. M. and Almeida, L. B. (1990). "Speech separation by means of stationary least-squares harmonic estimation," Proc. IEEE ICASSP, 809-812.
Stubbs, R. J. and Summerfield, Q. (1988). "Evaluation of two voice-separation algorithms using normal-hearing and hearing-impaired listeners," J. Ac. Soc. Am. 84, 1236-1249.
Stubbs, R. J. and Summerfield, Q. (1990). "Algorithms for separating the speech of interfering talkers: evaluations with voiced sentences, and normal-hearing and hearing-impaired listeners," J. Ac. Soc. Am. 87, 359-372.
Stubbs, R. J. and Summerfield, Q. (1991a). "Effects of signal-to-noise ratio, signal periodicity, and degree of hearing impairment on the performance of voice-separation algorithms," J. Ac. Soc. Am. 89, 1383-1393.
Summerfield, Q. and Assmann, P. F. (1991b). "Perception of concurrent vowels: effects of harmonic misalignment and pitch-period asynchrony," J. Ac. Soc. Am. 89, 1364-1377.
van Noorden, L. (1982). "Two channel pitch perception," in Music, mind, and brain, edited by M. Clynes (Plenum Press, London), 251-269.
Weintraub, M. (1985). "A theory and computational model of auditory monaural sound separation," thesis, Stanford University.
Weintraub, M. (1986). "A computational model for separating two simultaneous sounds," Proc. IEEE ICASSP, Tokyo, 1, 3.1.1-4.
Wickesberg, R. E. and Oertel, D. (1990). "Delayed, frequency-specific inhibition in the cochlear nuclei of mice: a mechanism for monaural echo suppression," J. Neuroscience 10, 1762-1768.
Yin, T. C. T. and Chan, J. C. K. (1990). "Interaural time sensitivity in medial superior olive of cat," J. Neurophysiol. 64, 465-488.
Yin, T. C. T., Chan, J. C. K. and Carney, L. H. (1987). "Effects of interaural time delays of noise stimuli on low-frequency cells in the cat's inferior colliculus. III. Evidence for cross-correlation," J. Neurophysiol. 58, 562-583.
Young, E. D. and Sachs, M. B. (1979). "Representation of steady-state vowels in the temporal aspects of the discharge patterns of populations of auditory-nerve fibers," J. Ac. Soc. Am. 66, 1381-1403.

---

------------------------------------------------------------------
Alain de Cheveigne'
phone:       (33) (1) 44273633
fax:         (33) (1) 44277919
e-mail:      alain(at)linguist.jussieu.fr
NeXT-format: nalain(at)linguist.jussieu.fr
japanese:    jalain(at)linguist.jussieu.fr
mail:        Laboratoire de Linguistique Formelle, CNRS / Universite' Paris 7,
             case 7003, 2 place Jussieu, 75251 Paris CEDEX 05, FRANCE.
------------------------------------------------------------------


This message came from the mail archive
http://www.auditory.org/postings/1993/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University