Date: Thu, 5 Nov 1998 12:16:39 +0100 (MET)
From: Toth Laszlo <tothl@inf.u-szeged.hu>
X-Sender: tothl@csilla
To: auditory@lists.mcgill.ca
Subject: voiced/unvoiced detection
Message-ID: <Pine.SV4.3.91.981105115242.16577A-100000@csilla>
Dear List,
there is a vast literature on estimating pitch based on simulations of
auditory processing. However, there seems to be much less information about
how to discriminate between pitched and unpitched (noise-like) parts of a
signal.
In speech processing the voiced/unvoiced decision is usually considered
more difficult than the measurement of pitch itself.
How would you measure how strong the sensation of "pitchedness" is? Does
this make sense at all, or is it a binary decision, that is, we either
hear or don't hear a pitch?
In particular, I'm looking for ideas about how to do voiced/unvoiced
detection of speech using auditory-like processing, e.g. the summary
autocorrelogram. In this case I'd guess I should measure how strongly the
peak "dominates" the summary autocorrelogram. What would give a measure
of this? E.g., does a narrower peak mean a more definite pitch sensation
than a wide, diffuse one? Or is it the height of the peak compared to its
neighborhood that counts? If so, how wide a "neighborhood" should I check?
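
To make the "peak dominance" question concrete, here is a minimal sketch of
two candidate measures. It is an illustration under stated assumptions, not
an established method: a plain normalized autocorrelation of the waveform
stands in for the summary autocorrelogram (no cochlear filterbank or hair-cell
model), the 80-400 Hz search range is an assumed speech pitch range, and the
function names and the median-based "contrast" are hypothetical choices.

```python
import numpy as np

def normalized_acf(frame, max_lag):
    """Normalized autocorrelation for lags 0..max_lag, each value in [-1, 1].

    Assumption: this is a stand-in for a summary autocorrelogram.
    """
    x = np.asarray(frame, dtype=float)
    x = x - x.mean()
    n = len(x)
    acf = np.zeros(max_lag + 1)
    for k in range(max_lag + 1):
        a, b = x[:n - k], x[k:]          # overlapping segments at lag k
        denom = np.sqrt(np.dot(a, a) * np.dot(b, b))
        acf[k] = np.dot(a, b) / denom if denom > 0 else 0.0
    return acf

def pitch_strength(frame, fs, fmin=80.0, fmax=400.0):
    """Return (height, contrast, lag) of the largest peak in the pitch range.

    height   : peak value of the normalized autocorrelation
    contrast : peak value minus the median over the whole search range
               (one hypothetical choice of "neighborhood")
    lag      : lag of the peak, in samples
    """
    lo = int(fs / fmax)                  # shortest pitch period considered
    hi = int(fs / fmin)                  # longest pitch period considered
    region = normalized_acf(frame, hi)[lo:hi + 1]
    idx = int(np.argmax(region))
    height = float(region[idx])
    contrast = height - float(np.median(region))
    return height, contrast, lo + idx

# Synthetic test frames (40 ms at 16 kHz); not real speech.
fs = 16000
n = np.arange(640)
voiced = sum(np.sin(2 * np.pi * 125.0 * h * n / fs) / h for h in range(1, 6))
noise = np.random.default_rng(0).standard_normal(len(n))

height_v, contrast_v, lag_v = pitch_strength(voiced, fs)
height_n, contrast_n, lag_n = pitch_strength(noise, fs)
print("voiced:", height_v, contrast_v, lag_v)
print("noise: ", height_n, contrast_n, lag_n)
```

The peak height gives a graded value rather than a binary decision; a
voiced/unvoiced decision would then need a threshold, which would have to be
tuned on real speech. Peak width is not measured in this sketch.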
Laszlo Toth
Hungarian Academy of Sciences
Research Group on Artificial Intelligence
e-mail: tothl@inf.u-szeged.hu
http://www.inf.u-szeged.hu/~tothl
"Experience is what you gain when you expected something else"