Re: Audio editing

I just want to add that there is a function in Adobe Audition (or at least there was in version 3.0; I am not sure about newer versions) called Group Waveform Normalize, which attempts to equate the perceived loudness of a set of files without clipping (while giving the user control over how much clipping is allowed). Optionally it takes into account the ear's different sensitivity across frequency ranges. In my opinion it works pretty well. The main drawback is that the algorithm behind it is unknown, and Adobe won't reveal it (I tried). As I understand it, it is based on RMS values (though who knows what window size) and some frequency weighting.
Of course, there is also the regular per-file Normalize, which sets the peak at 0 dBFS (or at another value).
Next, Amplitude Statistics can be used to calculate the RMS value of each file, which can later be used for amplitude adjustment using Amplify. This has to be done manually, as scripting in Audition is extremely limited (and may even have been dropped in recent versions). You would have to note the RMS value (Total, Max or Average, depending on your choice) and the Peak value for each file. Then a simple calculation gives you a positive or negative gain to apply to each file with Amplify, and you end up with all files having the same RMS value and no clipping. Note: if you use Audition 3.0, make sure you patch it to 3.0.1; 3.0 had a bug that affected Amplitude Statistics.
This procedure is also quite easy to automate in Matlab.
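The same RMS-matching calculation can be sketched outside Matlab too; here is a minimal Python version of the arithmetic described above (assuming samples are floats in [-1, 1]; the function names are illustrative, not Audition's):

```python
import math

def rms_and_peak(samples):
    """Return the RMS value and absolute peak of a list of floats in [-1, 1]."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    peak = max(abs(s) for s in samples)
    return rms, peak

def gain_to_target_rms(samples, target_rms_db):
    """Gain in dB that brings the file to the target RMS level,
    capped so the peak never exceeds 0 dBFS (i.e. no clipping)."""
    rms, peak = rms_and_peak(samples)
    desired_gain_db = target_rms_db - 20 * math.log10(rms)
    headroom_db = -20 * math.log10(peak)  # gain that would put the peak at exactly 0 dBFS
    return min(desired_gain_db, headroom_db)
```

Applying the returned value with Amplify (or multiplying the samples by `10 ** (gain_db / 20)`) brings every file to the same RMS level, except for files where the no-clipping cap kicks in first.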

On 19 December 2012 22:56, Kevin Austin <kevin.austin@xxxxxxxxxxxx> wrote:

I was confused by the question, which can be read in a number of ways. It is not clear if there are [say] 10 files of 2 minutes duration, or 40 files of 2 seconds. If there are 40 files, are their levels related to each other such that the amount of signal amplification [read: normalization] needs to remain constant across all 40 files, or is each file to be normalized independently of the others? The approach would be different in each case.

Normalization, applied correctly, will not clip a signal. The file will be scanned for the peak level and this peak will be amplified by 'n dB' so that the peak signal is "0 VU". This will not result in clipping if the software is designed correctly. One of the difficulties experienced with voice recording is that when very high quality mics have been used, there is a strong likelihood of DC offset producing an asymmetrical waveform.
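The peak-normalization procedure described above can be sketched in a few lines of Python (samples assumed to be floats in [-1, 1]; the function names are my own, not any particular editor's). The sketch subtracts the mean first, since a DC offset of the kind mentioned would otherwise eat into the available headroom:

```python
def remove_dc_offset(samples):
    """Subtract the mean so the waveform is centred on zero."""
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]

def peak_normalize(samples, target_db=0.0):
    """Scan for the absolute peak and scale so it sits at target_db dBFS
    (0.0 = full scale). Scaling by peak guarantees no sample clips."""
    peak = max(abs(s) for s in samples)
    target_linear = 10 ** (target_db / 20)
    gain = target_linear / peak
    return [s * gain for s in samples]
```

Because every sample is multiplied by the same gain, relative levels within the file are untouched, which is why simple normalization leaves naturalness intact.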

There are manual techniques for achieving a kind of 'RMS normalization', but they are labor intensive. As normalization does not change the relative amplitudes of signals within the file being processed, there should be no detectable change in naturalness, IME.


On 2012, Dec 18, at 12:34 PM, Matt Winn wrote:

> Abin and List,
> Forgive double-postings, as I apparently made an error trying to attach a file. It may be easier for you to do this in a scriptable (and free) environment like Praat instead of Audition. I would like to share a simple tool that I have made for this kind of intensity normalization.
> In the Praat script linked here, you can scale the intensities of all sounds in a folder to a selected level. It will alert you if any of the sounds clip, and offer you the option of decreasing your target intensity level until none of them clip. In the end, you will have a folder full of normalized sounds and an info text file to let you know what changes were applied. The original sounds are preserved.
> This is designed to use for a folder full of short sounds (e.g. words), and might not be ideal for longer sounds. It does not perform compression.
>  Find the script here:
> http://www.mattwinn.com/Scale_intensity_of_all_sounds_check_maxima_v2.txt
> To use it in Praat, either copy the text into a new Praat Script window or open it directly.
> Regarding naturalness - you should be aware that compression and (to a lesser extent) normalization actually decrease the naturalness of the signals by altering each of them in different ways. There are some inherent volume differences between some speech sounds (e.g. /s/ is louder than /f/, /a/ is louder than /u/), so normalizing levels for these sounds would decrease naturalness to some extent.
>  Good luck,
> Matt
> On Mon, Dec 17, 2012 at 5:05 PM, Abin Kuruvilla Mathew <amat527@xxxxxxxxxxxxxxxxx> wrote:
> Dear All,
> I have a set of audio files (consonants and vowels) to be edited in Adobe Audition and was wondering to what extent and how much normalization (RMS) and dynamic compression (if necessary) would be needed so that naturalness is preserved and clipping doesn't occur.
> kind regards,
> Abin
> --
> Abin K. Mathew
> Doctoral student
> Department of Psychology (Speech Science)
> Tamaki Campus, 261 Morrin Road, Glen Innes
> The University of Auckland
> Private Bag 92019
> Auckland- 1142
> New Zealand
> Email: amat527@xxxxxxxxxxxxxxxxx