Psycho-acoustic models ("Beerends, J.G." )


Subject: Psycho-acoustic models
From:    "Beerends, J.G."  <J.G.Beerends(at)RESEARCH.KPN.COM>
Date:    Wed, 18 Jun 1997 15:33:38 +0100

Dear List,

Concerning the recent discussion on cochlear vs. psychoacoustic models, the following. In developing a model for the prediction of the audio quality of low bit-rate codecs (original in, degraded out) we found that:

1) The behavior of the loudness growth above masked threshold is very important and can be modelled by smearing + compression (see REF [1]).

2) A number of peculiar cognitive effects play a role in the quality judgement of audio, the most important of which is the asymmetry effect. When a codec leaves something out of the original, the disturbance caused by the degradation is less than when something is introduced that does not belong to the original (see REF [2], [4], [7]).

3) For narrow-band speech, masking is of minor importance, and the cognitive effect mentioned under 2) has to be modelled in order to obtain high correlations (see REF [3], [4], [5], [8]).

In a benchmark by the International Telecommunication Union (Telecom sector) of five proposals for measuring speech quality, our proposal (the Perceptual Speech Quality Measure, PSQM) scored highest, with a correlation of around 0.97 on an unknown dataset. It was standardized as ITU-T Recommendation P.861. This PSQM method includes a very simple model of the cognitive asymmetry effect.

In a benchmark by the International Telecommunication Union (Radio sector) of six proposals for measuring audio (music) quality, our proposal (the Perceptual Audio Quality Measure, PAQM) scored highest, with a correlation of around 0.85. Work is currently in progress to integrate the best of all proposals.

Concerning our work at KPN Research, the following references may be of interest:

REFERENCES

[1] J. G. Beerends and J. A. Stemerdink. A perceptual audio quality measure based on a psychoacoustic sound representation. J. Audio Eng. Soc., 40:963-978, December 1992.

[2] J. G. Beerends and J. A. Stemerdink. Modelling a cognitive aspect in the measurement of the quality of music codecs. Contribution to the 96th AES Convention, Amsterdam, February 1994, preprint 3800.

[3] J. G. Beerends and J. A. Stemerdink. A perceptual speech quality measure based on a psychoacoustic sound representation. J. Audio Eng. Soc., 42:115-123, March 1994.

[4] J. G. Beerends. Modelling cognitive effects that play a role in the perception of speech quality. Contribution to the DEGA/ITG/EURASIP Workshop on speech quality assessment, Bochum, November 1994.

[5] J. G. Beerends. Measuring the quality of speech and music codecs, an integrated psychoacoustic approach. Contribution to the 98th AES Convention, Paris, February 1995, preprint 3945.

[6] ITU-T Study Group 12, Contribution COM 12-74, Review of validation tests for objective speech quality measures, March 1996.

[7] J. G. Beerends, W. A. C. van den Brink, and B. Rodger. The role of informational masking and perceptual streaming in the measurement of music codec quality. Contribution to the 100th AES Convention, Copenhagen, May 1996, preprint 4176.

[8] ITU-T, Recommendation P.861, Objective quality measurement of telephone-band (300-3400 Hz) speech codecs, August 1996.

John G. Beerends
KPN Research
P.O. Box 421
2260 AK Leidschendam
The Netherlands
Tel +3170 3325644
Fax +3170 3326477
E-mail J.G.Beerends(at)research.kpn.com

PS: The nice thing is that we are now doing the same for video, and finding the same effects.
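
The asymmetry effect mentioned under 2) can be illustrated with a minimal sketch. This is not the PSQM or PAQM algorithm; it only assumes that the original and degraded signals have already been turned into per-band internal (loudness-like) representations, and that components added by the codec are weighted more heavily than components it left out. The function name, the band values, and the two weights are hypothetical and chosen purely for illustration.

```python
def asymmetric_disturbance(original, degraded,
                           added_weight=1.0, missing_weight=0.5):
    """Sum per-band differences between two internal representations.

    Energy that appears only in the degraded signal (added by the codec)
    is weighted by `added_weight`; energy that is missing from the
    degraded signal is weighted by `missing_weight`. Both weights are
    illustrative assumptions, not values from any standard.
    """
    if len(original) != len(degraded):
        raise ValueError("representations must have the same number of bands")

    total = 0.0
    for ref, deg in zip(original, degraded):
        diff = deg - ref
        if diff > 0:
            # Something was introduced that does not belong to the original.
            total += added_weight * diff
        else:
            # Something was left out of the original.
            total += missing_weight * (-diff)
    return total


# Hypothetical three-band example: the same amount of change in band 2,
# once as a deletion and once as an addition.
original = [1.0, 2.0, 3.0]
band_dropped = [1.0, 0.0, 3.0]
band_added = [1.0, 4.0, 3.0]

dropped_score = asymmetric_disturbance(original, band_dropped)
added_score = asymmetric_disturbance(original, band_added)
```

With these assumed weights the added component yields a larger disturbance than the dropped one, even though the absolute change is identical, which is the qualitative behavior the message describes.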


This message came from the mail archive
http://www.auditory.org/postings/1997/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University