Re: Question on defining S/N ratio in speech-in-noise testing (Harvey Holmes)


Subject: Re: Question on defining S/N ratio in speech-in-noise testing
From:    Harvey Holmes  <H.Holmes@xxxxxxxx>
Date:    Thu, 13 Aug 2009 23:16:35 +1000
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Leo,

Another measure that has been widely used in speech work (especially
speech coding) is the segmental SNR (SNRseg or SEGSNR), originally
introduced by Peter Noll in 1974. It overcomes many of the problems
with a simple (global) SNR calculated over the entire duration of the
signal, especially the bias introduced by silent segments.

It allows for the non-stationary nature of speech signals by dividing
the (long) noisy speech signal into short segments (e.g. of 16 ms
duration) and calculating a logarithmic (dB) SNR value for each
segment. The final SNRseg value is the average of these values after
excluding very low energy segments.

SNRseg and several other alternatives to a simple global SNR measure
are discussed in Appendix E of the book "Digital Coding of Waveforms"
by N.S. Jayant and P. Noll (Prentice-Hall, 1984).

I think SNRseg should be calculated before pre-emphasis, as it is a
property of the noisy signal itself and not of your processor.

Regards,

        Harvey

At 05:57 13/08/2009, Leonid Litvak wrote:
>Hi All,
>
>I have a question regarding definition of signal-to-noise ratio as
>it applies to speech-in-noise testing, with speech material being
>sentences. On a simple level, SNR is just level of the signal
>divided by the level of the noise.
>
>The signal is typically speech, so its level fluctuates over time.
>Do people typically use the average signal level computed over the
>whole sentence, average signal level computed in 100 ms windows,
>medium signal level, maximum signal level, etc.?
>
>The same question could go for the noise token as well.
>
>I would very much appreciate references to papers that discuss these issues.
>
>Finally, we are interested to apply these tests to cochlear implant
>recipients that have a well-characterized pre-emphasis curve as part
>of their processor. Should the pre-emphasis curve be taken into
>account when computing S/N ratios? This is not an issue for
>spectrally-matched noises, but may be an issue for non-matched noises.
>
>Thank you very much!
>
>Leo
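
The segmental SNR procedure described in the reply above (short
segments, a dB SNR per segment, then an average that excludes very low
energy segments) might be sketched roughly as follows. This is only an
illustration under stated assumptions, not the exact definition from
Noll or from Jayant and Noll: it assumes the clean speech and the noise
are available as separate arrays before mixing, and the function names,
the non-overlapping framing, and the -40 dB exclusion threshold
(relative to the loudest speech frame) are all hypothetical choices.

```python
import numpy as np


def segmental_snr_db(speech, noise, fs, frame_ms=16.0, floor_db=-40.0):
    """Mean of per-frame SNRs in dB, skipping very low-energy speech frames.

    speech, noise : 1-D arrays of equal length (assumed available separately).
    fs            : sampling rate in Hz.
    frame_ms      : frame length in ms (16 ms is the example from the post).
    floor_db      : hypothetical threshold; frames whose speech energy lies
                    more than this far below the loudest frame are excluded.
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    if speech.ndim != 1 or speech.shape != noise.shape:
        raise ValueError("speech and noise must be 1-D arrays of equal length")

    n = int(round(fs * frame_ms / 1000.0))  # samples per frame
    if n < 1 or speech.size < n:
        raise ValueError("signal is shorter than one frame")
    n_frames = speech.size // n

    # Non-overlapping frames; a trailing partial frame is discarded.
    s = speech[: n_frames * n].reshape(n_frames, n)
    v = noise[: n_frames * n].reshape(n_frames, n)
    s_energy = np.sum(s ** 2, axis=1)
    v_energy = np.sum(v ** 2, axis=1)

    # Keep frames with enough speech energy and a non-zero noise energy.
    threshold = s_energy.max() * 10.0 ** (floor_db / 10.0)
    keep = (s_energy > threshold) & (v_energy > 0.0)
    if not np.any(keep):
        raise ValueError("no frames left after excluding low-energy segments")

    return float(np.mean(10.0 * np.log10(s_energy[keep] / v_energy[keep])))


def global_snr_db(speech, noise):
    """Simple global SNR in dB over the whole duration, for comparison."""
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    return float(10.0 * np.log10(np.sum(speech ** 2) / np.sum(noise ** 2)))
```

Comparing the two functions on a sentence with long pauses shows the
bias mentioned above: the pauses add noise energy but no speech energy
to the global figure, while the segmental figure ignores them.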


This message came from the mail archive
http://www.auditory.org/postings/2009/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University