
Re: dynamic range and sample bit depth



Dan,

CD audio is 16 bits per sample (linear, uncompressed).  So that's a pretty good quality reference point.

For parts of a song that are 48 dB down from the peak, only about 8 bits are being used.  But those parts are pretty quiet, and the quantization noise is still about 96 dB down from the very loudest signals.  That puts the noise somewhere near the absolute hearing threshold, and probably well below the masked threshold, provided you have good normal hearing and the loudest sound you make is not painful.
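
The point that the noise floor stays put while the signal gets quieter can be checked numerically. The following is a minimal Python sketch, not anything from the original exchange: the helper names, the 997 Hz test tone and the 44.1 kHz rate are assumptions chosen for illustration, and the quantizer is a plain round-to-nearest with no dither. It measures the RMS quantization error for a near-full-scale sine and for one 48 dB lower, both expressed relative to a full-scale sine.

```python
import math

def quantize(x, bits):
    """Round x (nominally in [-1, 1)) to the nearest signed code of the given bit depth."""
    scale = 2 ** (bits - 1)
    code = math.floor(x * scale + 0.5)
    code = max(-scale, min(scale - 1, code))  # clip to the representable range
    return code / scale

def noise_db_re_full_scale(level_db, bits=16, freq=997.0, rate=44100, n=44100):
    """RMS quantization error for a sine at level_db, in dB relative to a full-scale sine."""
    # Largest peak that avoids clipping, scaled down by the requested level.
    amp = 10 ** (level_db / 20) * (1 - 2 ** (1 - bits))
    err_sq = 0.0
    for i in range(n):
        x = amp * math.sin(2 * math.pi * freq * i / rate)
        err_sq += (quantize(x, bits) - x) ** 2
    err_rms = math.sqrt(err_sq / n)
    full_scale_rms = 1 / math.sqrt(2)  # RMS of a sine with peak 1.0
    return 20 * math.log10(err_rms / full_scale_rms)

loud = noise_db_re_full_scale(0.0)
quiet = noise_db_re_full_scale(-48.0)
print(f"noise with full-scale sine: {loud:.1f} dB re full scale")
print(f"noise with sine 48 dB down: {quiet:.1f} dB re full scale")
print(f"SNR of the quiet passage:   {-48.0 - quiet:.1f} dB")
```

Under the usual uniform-error model both noise figures should land near the textbook 6.02*16 + 1.76 = 98 dB below a full-scale sine, close to the round "96 dB" figure. What changes for the quiet passage is its own signal-to-noise ratio, not the absolute noise level.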

Still, a few more bits would be nice.  24 is certainly overkill.

It's not clear to me what sort of dithering you have in mind or how it's going to help here.

Dick


On Wed, Dec 3, 2014 at 10:05 AM, Dan Goodman <d.goodman@xxxxxxxxxxxxxx> wrote:
Dear auditory list,

I have been worried about an issue to do with sampling bit depth and dynamic range for a while, and I have not yet been able to find a definitive answer. Hopefully some of you may be able to shed some light on this.

Essentially, the question revolves around the fact that when a digital signal is attenuated, you lose information. For example, for a signal with 16 bits per sample, if you attenuate by 20*log10(2^8)=48 dB then the output will only be using 8 bits per sample. Having listened to 8 bit sounds, I find they are clearly of very poor quality. So, although it is often written that the 'dynamic range' of 16 bit sound is 96 dB (=20*log10(2^16)), at even 48 dB of attenuation the quality becomes terribly poor.
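
The arithmetic in this paragraph can be written out as a short Python sketch. The function names are hypothetical, introduced only for illustration; the formula is the 20*log10(2^bits) relation quoted above, which works out to about 6.02 dB per bit.

```python
import math

def dynamic_range_db(bits):
    """Ratio between full scale and one quantization step, in dB: 20*log10(2^bits)."""
    return 20 * math.log10(2 ** bits)

def bits_left_after_attenuation(bits, attenuation_db):
    """Bits still exercised by the signal peak after attenuating by attenuation_db."""
    db_per_bit = dynamic_range_db(1)  # about 6.02 dB
    return bits - attenuation_db / db_per_bit

for bits in (8, 16, 20, 24):
    print(f"{bits:2d} bits -> {dynamic_range_db(bits):6.2f} dB")

atten = dynamic_range_db(8)  # the "48 dB" attenuation from the example
print(f"16 bits attenuated by {atten:.2f} dB leaves "
      f"{bits_left_after_attenuation(16, atten):.1f} bits")
```

This only counts bits. It says nothing about how audible the resulting quantization error is, which depends on playback level and on whether dither is used.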

So some questions:

1. How many bits per sample do we need for a high quality encoding of a sound without any attenuation?

2. How much dynamic range can we therefore get from standard audio systems using 16 or 24 bits per sample?

3. Are we routinely using more than this dynamic range in our experiments (and in musical recordings) and is this a problem for the results of, for example, studies mixing normal and hearing impaired listeners?

4. Is there anything we can do about this?

Some more details:

The clearest thing I have managed to find on this subject so far is a paper by Bob Stuart of Meridian Audio (https://www.meridian-audio.com/meridian-uploads/ara/coding2.pdf) that concludes that if you have 19 bits per sample at a 52 kHz sample rate, and you use dithering, and your audio system doesn't do any further processing on the sound, then at 90 dB attenuation from the maximum level you shouldn't hear any noise from the encoding (based on the extremes of measured hearing thresholds). This suggests that using 20 bit audio you can probably get 96 dB of high quality dynamic range (see below for why I mention 20 bit audio).

This doesn't take into account that many (most?) researchers are probably not dithering their signals. As far as I can tell, Matlab's wavplay and audioplayer functions do not use dithering, for example. So how much dynamic range are we getting without introducing noise if we don't use dithering? And are any of the commonly used packages for playing sounds doing this dithering?
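
To make the dithering question concrete, here is a minimal Python sketch of one common scheme, TPDF (triangular probability density) dither added before rounding. It is an illustration under stated assumptions only: it is not what Matlab or any particular playback package does, it is not taken from the Stuart paper, and the function names, the 0.4 LSB test amplitude, the 997 Hz tone and the fixed random seed are all arbitrary choices. Values are expressed in units of one quantization step (LSB).

```python
import math
import random

def quantize_lsb(x_lsb, rng=None):
    """Round a value given in LSBs to the nearest integer code.

    If rng is supplied, TPDF dither (sum of two uniform draws, +/-1 LSB peak)
    is added before rounding.
    """
    dither = (rng.random() - 0.5) + (rng.random() - 0.5) if rng else 0.0
    return math.floor(x_lsb + dither + 0.5)

def recovered_amplitude(codes, freq, rate):
    """Amplitude (in LSBs) of the freq component, estimated by correlation with a sine."""
    n = len(codes)
    acc = sum(c * math.sin(2 * math.pi * freq * i / rate) for i, c in enumerate(codes))
    return 2 * acc / n

rate, freq, n = 44100, 997.0, 44100
amp_lsb = 0.4  # a tone smaller than half a quantization step
signal = [amp_lsb * math.sin(2 * math.pi * freq * i / rate) for i in range(n)]

plain = [quantize_lsb(x) for x in signal]
rng = random.Random(0)  # fixed seed so the run is repeatable
dithered = [quantize_lsb(x, rng) for x in signal]

print("undithered output is all zeros:", all(c == 0 for c in plain))
print(f"tone amplitude recovered from dithered output: "
      f"{recovered_amplitude(dithered, freq, rate):.3f} LSB")
```

Without dither the sub-LSB tone is rounded away entirely. With dither the output is noisier sample by sample, but the tone survives in it on average, which is the sense in which dither trades distortion for a steady noise floor.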

Note: I mentioned 20 bit audio because I have read that 24 bit DACs only really use at most 22 bits of the signal, and due to thermal noise give about 20 bits noise free. I worked with one system in the past that I was told allowed you to select which 22 bits were used (although this was hardware specific and had to be coded in at a very low level, not using standard audio APIs).

I am very far from an expert on any of this, but it seems to me that we need to be using 24 bit audio and (very importantly) dithering, and that if we do, we can probably get 96 dB of high quality dynamic range. It is possible that in some experimental setups we might be exceeding this, especially if we're testing normal hearing and hearing impaired listeners across a wide range of sounds on the same system. If so, is there anything we can do?

Finally, any thoughts on the relevance for music / commercial audio? I guess it is much less of an issue there since the problems only seem to arise if you really push the limits of dynamic range.

Thanks in advance,
Dan Goodman