Dear Auditory list,
It's my pleasure to announce the following technical report available
via WWW.
Thanks for your attention,
Guoning Hu
***************************************************************
"Monaural speech segregation based on pitch tracking and amplitude
modulation"
Technical Report #6, March 2002
Department of Computer and Information Science
The Ohio State University
***************************************************************
Guoning Hu, The Ohio State University
DeLiang Wang, The Ohio State University
Speech segregation in the monaural condition has proven to be very
challenging. Monaural speech segregation has been studied in previous
systems that incorporate auditory scene analysis principles. A major
problem for these systems is their inability to deal with speech in the
high-frequency range. Psychoacoustic evidence suggests that different
perceptual mechanisms are involved in handling resolved and unresolved
harmonics. We propose a system that deals with resolved and unresolved
harmonics differently. For resolved harmonics, the system generates
segments based on temporal continuity and cross-channel correlation, and
groups them according to their periodicities. For unresolved harmonics,
it generates segments based on common amplitude modulation (AM) in
addition to temporal continuity and groups them according to AM
repetition rates derived from sinusoidal modeling. Underlying the
segregation process is a pitch contour that is first estimated from
speech segregated according to global pitch and then adjusted according
to psychoacoustic constraints. Our system is systematically evaluated,
and it yields substantially better performance than previous systems,
especially in the high-frequency range.
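
For readers unfamiliar with the cross-channel correlation cue mentioned
above, here is a minimal Python sketch of one common way to compute it:
the autocorrelation of each filter channel is normalized to zero mean and
unit variance over lag, and adjacent channels are then correlated. This
is an illustration only and is not taken from the report; NumPy, the
function names, and the parameter choices are all assumptions.

```python
import numpy as np

def autocorrelation(frame, max_lag):
    """Plain autocorrelation of one channel response for lags 0..max_lag-1."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    return np.array([np.dot(frame[:n - lag], frame[lag:])
                     for lag in range(max_lag)])

def cross_channel_correlation(responses, max_lag):
    """Correlate normalized autocorrelations of adjacent channels.

    responses: array of shape (channels, samples) holding the filter
    outputs for one time frame (hypothetical input layout).
    Returns an array of length channels - 1; values near 1 suggest that
    two neighboring channels respond to the same periodic component.
    """
    acfs = np.array([autocorrelation(r, max_lag) for r in responses])
    # Normalize each channel's autocorrelation over lag.
    acfs = acfs - acfs.mean(axis=1, keepdims=True)
    std = acfs.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # avoid division by zero for silent channels
    acfs = acfs / std
    return np.mean(acfs[:-1] * acfs[1:], axis=1)
```

In this sketch, neighboring channels with a high correlation would be
candidates for merging into the same segment; the threshold and the
front-end filterbank are left open here.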
The report is available at:
http://www.cis.ohio-state.edu/~hu/Publication/TR6.pdf
Related sound demos can be found at:
http://www.cis.ohio-state.edu/~hu/Publication/MSSDemo.htm
Preliminary versions (in pdf) of this work are included in
2001 IEEE WASPAA and 2002 IEEE ICASSP.