Re: Voice Quality (Rahul Shrivastav)

Subject: Re: Voice Quality
From:    Rahul Shrivastav  <rahul(at)CSD.UFL.EDU>
Date:    Wed, 12 Nov 2003 13:44:35 -0500

SRK,

You have raised a question that is not easy to answer, and here are my two cents on these issues:

First, perceived voice quality results from multiple acoustic changes, and the measures you have described will all have a role in quantifying the final percept. Some of these measures (e.g., jitter, shimmer) are influenced by the phonetic environment, the fundamental frequency, and the window size used for analysis, so you will have to be very careful when you make your measurements.

Second, listeners' judgments of voice quality are most likely also influenced by suprasegmental and paralinguistic factors (age and sex of the speaker, emotions, etc.). I do not know if these factors come into play in your work.

Third, as of today, I am not aware of any "standard" formula to put all the various measures together and come up with a single index of "good voice quality." While some efforts towards this are ongoing, these often look at only one or two sub-types of quality.

If you have a pre-determined set of stimuli that you consider "ideal", you may be able to use some sort of distance measure to determine how the given stimulus differs from the ideal. Most likely, you will have to apply some time-normalization procedure (e.g., dynamic time warping) before you calculate these measures. However, it seems that your stimuli differ in their content, so this may not be possible.

If you do not have a pre-determined "ideal" stimulus, you may want to choose a set of measures that you believe will adequately reflect the nature of the voice qualities that you expect to see in your set. You can then compare the measures from your stimuli to the published normative data on each measure. The voices closest to the ideal would likely be the ones that deviate least from the normative set.

Hope this helps! I am curious to know what other folks on the list have to say. Would you mind sharing the information you get with me?
Thanks,
Rahul

----------------------------
Rahul Shrivastav, Ph.D.
Assistant Professor
Communication Sciences and Disorders
Dauer Hall, Room 48
Gainesville FL 32611
Phone: (352) 392-2046 (ext. 230)
Fax: (352) 392-6170
----------------------------

-----Original Message-----
From: AUDITORY Research in Auditory Perception [mailto:AUDITORY(at)LISTS.MCGILL.CA] On Behalf Of Seetharamakrishnan
Sent: Wednesday, November 12, 2003 12:01 PM
To: AUDITORY(at)LISTS.MCGILL.CA
Subject: Voice Quality

Dear Friends,

I am not an expert in voice analysis, yet I have to assess voice quality from conversational speech, i.e., compare an "ideal voice" with a spoken voice. The ideal voice would be recorded when the voice quality is good. Whenever the same person speaks, his/her voice will be compared to the parameters of this ideal voice and deviations will be indicated. The content and duration of the ideal voice and the spoken voice will be different.

I have some software to measure sound analysis parameters like Intensity, Pitch, HNR, Mean dB, SD, Jitter, Shimmer, Silence, Unvoiced frames, etc. I don't know how to correlate the measured values with the perceived quality of voice.

Now my question is: what measurement parameters can be reliably used to compare the "ideal voice" and the spoken voice, and how? The only criterion is that the spoken voice should have definitely deviated in quality in some manner or other. I am not able to arrive at which measurements I can reliably and consistently use to satisfy the above criterion, though I know that certain measurements like mean dB, mean pitch, silence percentage, number of unvoiced frames, voice breaks, etc. can be used.

One more thing is how large a window (time in seconds) should be taken to arrive at a reliable comparison.

Any light on this topic would be appreciated.

Regards
srk
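
The "distance from the ideal" idea in the reply can be illustrated with a minimal sketch. It assumes that each recording has already been reduced to one summary value per measure (so no time alignment is needed even though content and duration differ), that several "ideal" recordings of the same speaker are available, and that a z-score per measure is an acceptable way to express deviation. None of this comes from the thread itself: the function names, the measure names, and every number below are hypothetical and are not normative data.

```python
from statistics import mean, stdev


def baseline_stats(reference_sessions):
    """Return {measure: (mean, sd)} across the speaker's 'ideal' recordings.

    reference_sessions: list of dicts, one per recording, all with the same
    measure names as keys. At least two recordings are needed for an sd.
    """
    stats = {}
    for name in reference_sessions[0]:
        values = [session[name] for session in reference_sessions]
        stats[name] = (mean(values), stdev(values))
    return stats


def deviation_scores(sample, stats):
    """Return {measure: z-score} of one new recording against the baseline."""
    scores = {}
    for name, (mu, sigma) in stats.items():
        if sigma == 0:
            # No variability in the baseline: a z-score is undefined.
            raise ValueError(f"baseline for {name!r} has zero variance")
        scores[name] = (sample[name] - mu) / sigma
    return scores


def overall_distance(scores):
    """Root-mean-square of the z-scores, as one crude summary number."""
    return (sum(z * z for z in scores.values()) / len(scores)) ** 0.5


# Hypothetical summary values, made up purely for illustration.
ideal_recordings = [
    {"jitter_pct": 0.40, "shimmer_pct": 2.9, "hnr_db": 21.0},
    {"jitter_pct": 0.45, "shimmer_pct": 3.1, "hnr_db": 20.2},
    {"jitter_pct": 0.38, "shimmer_pct": 3.0, "hnr_db": 21.5},
]
new_recording = {"jitter_pct": 0.70, "shimmer_pct": 4.2, "hnr_db": 17.0}

stats = baseline_stats(ideal_recordings)
z = deviation_scores(new_recording, stats)
for measure, score in z.items():
    print(f"{measure}: z = {score:+.2f}")
print(f"overall distance: {overall_distance(z):.2f}")
```

Weighting all measures equally is itself an assumption; as the reply notes, there is no standard formula for combining them, and measures such as jitter and shimmer depend on the analysis settings, so the baseline and the new recordings would need to be measured the same way.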

This message came from the mail archive
maintained by:
DAn Ellis
Electrical Engineering Dept., Columbia University