Subject: Re: Audio editing
From: Paweł Kuśmierek <pawel.kusmierek@xxxxxxxx>
Date: Thu, 20 Dec 2012 13:49:09 -0500
List-Archive: <http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

I just want to add that there is a function in Adobe Audition (or at least there was in version 3.0; I am not sure about newer versions) called Group Waveform Normalize, which attempts to equate the perceived loudness of a set of files without clipping (or with user control over how much clipping is allowed). Optionally, it takes into account the different hearing sensitivity at different frequency ranges. In my opinion it works pretty well. The main drawback is that the algorithm behind it is unknown, and Adobe won't reveal it (I tried). As I understand it, it is based on RMS values (but who knows what window size) and some frequency weighting.

Of course, there is also the regular per-file Normalize, which sets the peak at 0 dBFS (or at another value).

Next, Amplitude Statistics can be used to calculate the RMS value of each file, which can later be used for amplitude adjustment using Amplify. This would have to be done manually, as scripting in Audition is extremely limited (and might even have been dropped in recent versions). You would have to note the RMS value (Total, Max or Average, depending on your choice) and the Peak value for each file. Then a simple calculation will give you the positive or negative gain to be applied to each file using Amplify, and you will end up with all files having the same RMS value and none of them clipped. Note: if you use Audition 3.0, make sure that you patch it to 3.0.1. Version 3.0 had a bug that affected Amplitude Statistics.

This procedure is also quite easy to automate in Matlab.

Pawel

On 19 December 2012 22:56, Kevin Austin <kevin.austin@xxxxxxxx> wrote:

> Hi
>
> I was confused by the question, which can be read in a number of ways. It is not clear if there are [say] 10 files of 2 minutes duration, or 40 files of 2 seconds. If there are 40 files, are their levels related to each other such that the amount of signal amplification [read: normalization] needs to remain constant across all 40 files, or is each file to be normalized independently of the others? The approach would be different in each case.
>
> Normalization, applied correctly, will not clip a signal. The file will be scanned for the peak level and this peak will be amplified by 'n dB' so that the peak signal is "0 VU". This will not result in clipping if the software is designed correctly. One of the difficulties experienced with voice recording is that when very high quality mics have been used, there is a strong likelihood of DC offset producing an asymmetrical waveform.
>
> There are manual techniques for achieving a kind of 'RMS normalization', but they are labor intensive. As normalization does not change the relative amplitude of signals within the file being processed, there should be no detectable change in naturalness, IME.
>
> Kevin
>
> On 2012, Dec 18, at 12:34 PM, Matt Winn wrote:
>
> > Abin and List,
> >
> > Forgive double-postings, as I apparently made an error trying to attach a file. It may be easier for you to do this in a scriptable (and free) environment like Praat instead of Audition. I would like to share a simple tool that I have made for this kind of intensity normalization.
> >
> > In the Praat script linked here, you can scale the intensities of all sounds in a folder to a selected level. It will alert you if any of the sounds clip, and offer you the option of decreasing your target intensity level until none of them clip. In the end, you will have a folder full of normalized sounds and an info text file to let you know what changes were applied. The original sounds are preserved.
> >
> > This is designed for use with a folder full of short sounds (e.g. words), and might not be ideal for longer sounds. It does not perform compression.
> >
> > Find the script here:
> >
> > http://www.mattwinn.com/Scale_intensity_of_all_sounds_check_maxima_v2.txt
> >
> > To use it in Praat, either copy the text into a new Praat Script window or open it directly.
> >
> > Regarding naturalness - you should be aware that compression and (to a lesser extent) normalization actually decrease the naturalness of the signals by altering each of them in different ways. There are some inherent volume differences between some speech sounds (e.g. /s/ is louder than /f/, /a/ is louder than /u/), so normalizing levels for these sounds would decrease naturalness to some extent.
> >
> > Good luck,
> >
> > Matt
> >
> > On Mon, Dec 17, 2012 at 5:05 PM, Abin Kuruvilla Mathew <amat527@xxxxxxxx> wrote:
> >
> > Dear All,
> >
> > I have a set of audio files (consonants and vowels) to be edited in Adobe Audition and was wondering to what extent and how much Normalization (RMS) and dynamic compression (if necessary) would be needed so that the naturalness is preserved and clipping doesn't occur.
> >
> > kind regards,
> > Abin
> >
> > --
> > Abin K. Mathew
> > Doctoral student
> > Department of Psychology (Speech Science)
> > Tamaki Campus, 261 Morrin Road, Glen Innes
> > The University of Auckland
> > Private Bag 92019
> > Auckland- 1142
> > New Zealand
> > Email: amat527@xxxxxxxx
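
The "simple calculation" mentioned in the top message (note each file's RMS and Peak, then work out one gain per file so that all files share the same RMS and none clips) could be sketched roughly as follows. This is only an illustration under stated assumptions, not Audition's algorithm or the linked Praat script: it is written in Python rather than Matlab, the function names (rms_db, peak_db, gains_for_equal_rms, apply_gain) are hypothetical, samples are assumed to be already loaded as floats in the range -1 to 1, RMS is taken over the whole file with no frequency weighting, and silent files are not handled.

```python
import math


def rms_db(samples):
    """Whole-file RMS level in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        raise ValueError("silent file: RMS level is undefined")
    return 10.0 * math.log10(mean_square)


def peak_db(samples):
    """Peak level in dB relative to full scale (1.0)."""
    return 20.0 * math.log10(max(abs(s) for s in samples))


def gains_for_equal_rms(files, target_rms_db):
    """Return (target actually used, {name: gain in dB}).

    files maps a name to a list of samples. If the requested target
    would push any peak above 0 dBFS, the target is lowered for ALL
    files so that they still end up with the same RMS level.
    """
    levels = {name: (rms_db(x), peak_db(x)) for name, x in files.items()}
    # After the gain, a file's peak sits at target + (peak - rms),
    # so the largest peak-to-RMS difference limits the usable target.
    max_crest_db = max(peak - rms for rms, peak in levels.values())
    target = min(target_rms_db, -max_crest_db)
    gains = {name: target - rms for name, (rms, _peak) in levels.items()}
    return target, gains


def apply_gain(samples, gain_db):
    """Scale samples by a gain given in dB (the 'Amplify' step)."""
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in samples]
```

Lowering the target for the whole set, rather than limiting individual files, keeps the RMS levels equal across files, which is the same idea as decreasing the target intensity until none of the sounds clip.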