
Psycho-acoustic models



Dear List,

Concerning the recent discussion on cochlear vs. psychoacoustic models,
the following:

In developing a model for the prediction of audio quality of low
bit-rate codecs (original in, degraded out) we found the following:

1) The behavior of the loudness growth above masked threshold is very
important and can be modelled by smearing + compression (see AES
December 1992, REF [1]).
2) A number of peculiar cognitive effects play a role in quality
judgement of audio, the most important of which is the asymmetry effect.
When a codec leaves something out of the original, the degradation is
less disturbing than when something is introduced which does not
belong to the original (see REF [2], [4], [7]).
3) For narrow-band speech, masking is of minor importance and the
cognitive effect mentioned under 2) has to be modelled in order to
obtain high correlations (see REF [3], [4], [5], [8]).
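
As a rough illustration of points 1) and 2), the sketch below combines a
crude smearing step, a power-law compression and an asymmetric weighting
of the difference between original and degraded signal. It is not the
PAQM or PSQM algorithm: the function names, the neighbour-band smearing,
the exponent 0.3 and the weights 2.0 and 1.0 are all assumptions chosen
only to show the idea.

```python
def smear(excitation, spread=0.2):
    """Leak a fraction of each band's energy into its neighbours.

    A crude stand-in for smearing; `spread` is an assumed value.
    """
    n = len(excitation)
    out = []
    for i in range(n):
        left = excitation[i - 1] if i > 0 else 0.0
        right = excitation[i + 1] if i < n - 1 else 0.0
        out.append(excitation[i] + spread * (left + right))
    return out


def compress(excitation, exponent=0.3):
    """Power-law compression of non-negative band energies (assumed exponent)."""
    return [e ** exponent for e in excitation]


def asymmetric_disturbance(reference, degraded,
                           added_weight=2.0, removed_weight=1.0):
    """Average weighted difference between two internal representations.

    Components that the degraded signal adds count more heavily than
    components it leaves out. Both weights are hypothetical.
    """
    ref = compress(smear(reference))
    deg = compress(smear(degraded))
    total = 0.0
    for r, d in zip(ref, deg):
        diff = d - r
        weight = added_weight if diff > 0 else removed_weight
        total += weight * abs(diff)
    return total / len(ref)


# Illustrative band energies only, not measured data.
original = [1.0, 4.0, 1.0, 0.0]
with_extra = [1.0, 4.0, 1.0, 2.0]   # same signal plus one extra component

introduced = asymmetric_disturbance(original, with_extra)
left_out = asymmetric_disturbance(with_extra, original)
```

With these assumed weights, the case where the component is introduced
is penalised twice as heavily as the case where the same component is
left out.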

In a benchmark by the International Telecommunication Union (Telecom
sector) of five proposals for measuring speech quality our proposal (the
Perceptual Speech Quality Measure, PSQM) scored highest with a
correlation of around 0.97 on an unknown dataset. It was standardized as
ITU-T recommendation P.861. In this PSQM method a very simple model
of the cognitive asymmetry effect is included.
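
For readers who want to reproduce this kind of figure on their own data,
a correlation between objective scores and subjective ratings can be
computed as sketched below. This assumes a plain Pearson correlation,
which may differ from the exact procedure used in the benchmark; the
function name and the numbers are placeholders, not benchmark data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Placeholder values: predicted quality per condition and the
# corresponding mean subjective rating.
predicted = [1.5, 2.4, 3.1, 3.9, 4.6]
subjective = [1.7, 2.2, 3.3, 3.8, 4.5]

r = pearson(predicted, subjective)
```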

In a benchmark by the International Telecommunication Union (Radio
sector) of six proposals for measuring audio (music) quality our
proposal (the Perceptual Audio Quality Measure, PAQM) scored highest
with a correlation of around 0.85. Currently work is in progress to
integrate the best of all proposals.

Concerning our work at KPN Research the following references may be of
interest:

REFERENCES
[1] J. G. Beerends and J. A. Stemerdink. A perceptual audio quality
measure based on a psychoacoustic sound representation.  J. Audio Eng.
Soc., 40:963-978, December 1992.

[2] J. G. Beerends and J. A. Stemerdink. Modelling a cognitive aspect in
the measurement of the quality of music codecs. Contribution to the 96th
AES Convention, Amsterdam, February 1994, preprint 3800.

[3] J. G. Beerends and J. A. Stemerdink. A perceptual speech quality
measure based on a psychoacoustic sound representation.  J. Audio Eng.
Soc., 42:115-123, March 1994.

[4] J. G. Beerends. Modelling cognitive effects that play a role in the
perception of speech quality. Contribution to the DEGA/ITG/EURASIP
Workshop on speech quality assessment, Bochum, November 1994.

[5] J. G. Beerends. Measuring the quality of speech and music codecs, an
integrated psychoacoustic approach. Contribution to the 98th AES
Convention, Paris, February 1995, preprint 3945.

[6] ITU-T Study Group 12, Contribution COM 12-74, Review of validation
tests for objective speech quality measures, March 1996.

[7] J. G. Beerends, W. A. C. van den Brink, and B. Rodger.  The role of
informational masking and perceptual streaming in the measurement of
music codec quality. Contribution to the 100th AES Convention,
Copenhagen, May 1996, preprint 4176.

[8] ITU-T, Recommendation P.861, Objective quality measurement of
telephone-band (300-3400 Hz) speech codecs, August 1996.




John G. Beerends
KPN Research
P.O. Box 421
2260 AK  Leidschendam
The Netherlands
Tel    +3170   3325644
Fax   +3170   3326477
E-mail  J.G.Beerends@research.kpn.com



PS: The nice thing is that we are now doing the same for video and
finding the same effects.