Hi Andy. This is a very tricky question. I am not aware of any definitive data that really addresses the issue adequately.
Laboratory studies tend to use single sound sources in anechoic conditions, and the auditory system copes very well in these conditions. The results indicate that listeners can cope with very low SNRs (e.g. -10 dB for spatialised speech-shaped noise interference, and lower for speech interferers). Moreover, some studies have used several interferers (e.g. Peissig and Kollmeier '97, Hawley et al. '04), and shown a gradual elevation in SRT with increasing numbers of interferers.

Simulating a more complex scene, like a restaurant with multiple interferers and reverberation, produces progressive degradation, though. We have been simulating up to eight interfering voices from a variety of speakers, with reverb based on real-room binaural room impulse responses. SRTs are around -2 to -3 dB with eight interfering voices. I haven't begun to write this work up yet, but
the results are not dissimilar to those from a cruder preliminary study published here:

Culling, J. F. (2013). “Energetic and informational masking in a simulated restaurant environment” in Moore, B. C. J., Carlyon, R. P., Gockel, H., Patterson, R. D. and Winter, I. M. (eds) Basic Aspects of Hearing: Physiology and Perception (Springer, New York).

There remain limitations to this approach, of course. The technique depends on standard target speech materials (IEEE/Harvard sentences) that are not very typical of normal conversation - particularly lacking a
conversational context. It is also unclear whether 50% keyword intelligibility is a tolerable level of comprehension for conversation.

Karolina's study has other limitations. If I remember correctly, the material was recorded from hearing-impaired individuals, who may avoid the more severe listening conditions into which normally hearing people thrust themselves. Also, the method of establishing the SNR from the recordings would probably become impossible below a certain SNR, as it relies on a researcher judging from the recordings alone whether or not target speech is present. Noise level is measured from epochs without target speech, and speech level is derived by subtraction.

Nonetheless, both approaches indicate that real-world SNRs are unlikely to be very near -10 dB, and are more likely to be somewhere around 0 dB. Karolina's work suggests a bit above, mine a bit below.

I guess what is really needed is for pairs of interlocutors to be wired up with
close microphones at the mouth (to establish reliably who is talking when)
and at their ears, and then to go out for the night and try to produce normal speaking and listening behaviour. Perhaps after a few nights of this they
would habituate to all the kit, and produce data that will get us closer to a
true answer.
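
As an aside on the recording-based method mentioned above (noise level from epochs without target speech, speech level by subtraction), the arithmetic could be sketched roughly as below. This is only an illustration in Python, not the procedure actually used in Karolina's study: it assumes speech and noise are uncorrelated so that their powers add, and the function name `snr_by_subtraction` and the example levels are hypothetical.

```python
import math

def snr_by_subtraction(mix_level_db, noise_level_db):
    """Estimate SNR in dB from two measured levels (hypothetical helper).

    mix_level_db:   level of epochs containing target speech plus noise
    noise_level_db: level of epochs without target speech
    Assumes uncorrelated speech and noise, so powers are additive.
    """
    mix_power = 10 ** (mix_level_db / 10)
    noise_power = 10 ** (noise_level_db / 10)
    # Speech power is what remains after removing the noise power.
    speech_power = mix_power - noise_power
    if speech_power <= 0:
        # Speech-plus-noise is no louder than noise alone,
        # so the speech level cannot be recovered.
        return None
    return 10 * math.log10(speech_power / noise_power)

# Made-up example: speech-plus-noise epochs 3 dB above noise-only epochs
# corresponds to an SNR very close to 0 dB.
print(snr_by_subtraction(73.0, 70.0))
```

Under the same assumption, an SNR of -10 dB raises the overall level by only about 0.4 dB over noise alone (10*log10(1.1)), so small measurement errors in either level would swamp the estimate at low SNRs.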

John

From: AUDITORY - Research in Auditory Perception [mailto:AUDITORY@xxxxxxxxxxxxxxx] On Behalf Of Andy Sabin

Hi List,

Can anyone point me to a reference showing SNRs that are typically observed in public spaces (e.g., restaurants, bars, etc.)? I can find this info for overall SPL, but am having a hard time finding it for SNR.

Thanks
Andy Sabin