Re: Voice Quality ("Kim, Doh-Suk (DS)" )


Subject: Re: Voice Quality
From:    "Kim, Doh-Suk (DS)"  <dsk(at)LUCENT.COM>
Date:    Wed, 12 Nov 2003 14:36:40 -0500

Srk,

ITU-T Recommendation P.862 is a standard for this. It is called PESQ
(Perceptual Evaluation of Speech Quality) and is designed to estimate the
quality of speech processed by telephone networks. It compares a reference
speech signal with its degraded version and produces an estimate of the
quality of the degraded speech on the MOS (mean opinion score) scale. It's
probably what you're looking for. You can find some references in the
October 2002 issue of J. Audio Eng. Soc.

Regards,
---
Doh-Suk Kim

> -----Original Message-----
> From: Seetharamakrishnan [mailto:seethark(at)ETH.NET]
> Sent: Wednesday, November 12, 2003 12:01 PM
> To: AUDITORY(at)LISTS.MCGILL.CA
> Subject: Voice Quality
>
> Dear Friends
>
> I am not an expert in voice analysis, but I have to assess voice quality
> from conversational speech, i.e., compare an "ideal voice" with a spoken
> voice. The ideal voice would be recorded when the voice quality is good.
> Whenever the same person speaks, his/her voice will be compared with the
> parameters of this ideal voice, and deviations will be indicated. The
> content and duration of the ideal voice and the spoken voice will be
> different. I have some software to measure sound analysis parameters such
> as intensity, pitch, HNR, mean dB, SD, jitter, shimmer, silence, unvoiced
> frames, etc.
>
> I don't know how to correlate the measured values with the perceived
> quality of voice. My question is: which measurement parameters can be
> reliably used to compare the "ideal voice" and the spoken voice, and how?
>
> The only criterion is that the spoken voice should have definitely
> deviated in quality in some manner or other. I have not been able to
> determine which measurements I can use reliably and consistently to
> satisfy this criterion, though I know that certain measurements such as
> mean dB, mean pitch, silence percentage, number of unvoiced frames, voice
> breaks, etc. can be used.
>
> One more thing: how large a window (time in seconds) should be taken to
> arrive at a reliable comparison?
>
> Any light on this topic would be appreciated.
>
> Regards
> srk
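
To make the full-reference idea behind P.862 concrete (a reference signal
and its degraded version go in, one quality figure comes out), here is a
minimal Python sketch. It is not PESQ and does not produce a MOS value: it
computes a plain segmental SNR as a stand-in. The function name, frame
length, and clamping limits are assumptions chosen for illustration, and the
sketch assumes both signals carry the same utterance, are time-aligned, and
share one sampling rate.

```python
import math


def segmental_snr_db(reference, degraded, frame_len=256,
                     floor_db=-10.0, ceil_db=35.0):
    """Mean per-frame SNR in dB between a reference and its degraded copy.

    Illustrative stand-in for a full-reference measure; NOT the P.862
    algorithm. frame_len, floor_db and ceil_db are assumed values.
    """
    n = min(len(reference), len(degraded))
    scores = []
    for start in range(0, n - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        deg = degraded[start:start + frame_len]
        signal = sum(r * r for r in ref)
        noise = sum((r - d) ** 2 for r, d in zip(ref, deg))
        if signal == 0.0:
            continue  # skip frames where the reference is silent
        if noise == 0.0:
            snr = ceil_db  # identical frames: report the upper clamp
        else:
            snr = 10.0 * math.log10(signal / noise)
        # Clamp so a few extreme frames do not dominate the average.
        scores.append(max(floor_db, min(ceil_db, snr)))
    if not scores:
        raise ValueError("no usable frames")
    return sum(scores) / len(scores)


# Synthetic example: one second of a 200 Hz tone sampled at 8 kHz,
# degraded by adding a weaker 1300 Hz tone.
fs = 8000
reference = [math.sin(2 * math.pi * 200 * t / fs) for t in range(fs)]
degraded = [r + 0.1 * math.sin(2 * math.pi * 1300 * t / fs)
            for t, r in enumerate(reference)]

clean_score = segmental_snr_db(reference, reference)
degraded_score = segmental_snr_db(reference, degraded)
print(clean_score, degraded_score)
```

The undistorted copy scores at the upper clamp and the distorted copy scores
lower. A measure of this kind needs the same content in both recordings, so
it does not apply directly when the content and duration differ.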


This message came from the mail archive
http://www.auditory.org/postings/2003/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University