Re: [AUDITORY] Converting audio file from WAV to MP3 changes file duration. Why? (Spencer Russell)


Subject: Re: [AUDITORY] Converting audio file from WAV to MP3 changes file duration. Why?
From:    Spencer Russell  <sfr@xxxxxxxx>
Date:    Thu, 16 Nov 2017 09:33:47 -0500
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

If you're playing in a browser I would also worry about an unknown
amount of latency caused by the browser and the system audio (probably
on the order of 50-100 ms). Are you in control of the computer used for
the experiment, or are you distributing the experiment over the web to
be run on subjects' computers (or phones)?

-s

On Wed, Nov 15, 2017, at 10:03 PM, Neeraj Sharma wrote:
> Members,
>
> Thank you for the suggestions and the useful links.
> In particular, http://lame.sourceforge.net/tech-FAQ.txt states the
> reason for the increase in duration from WAV to MP3.
> Loosely stated, the increase is due to zero-padding at the start and
> end. The zero-padding at the start seems to be fixed (for the codec
> used), but that at the end depends on the input file duration (or
> number of samples).
>
> Why this is bothering me:
>
> I have sound stimuli created in WAV. I have created time stamps that
> map to certain significant "waveform events" in the signal.
> I will be playing back these stimuli in HTML, and due to certain
> requirements I have to use MP3s. The issue is that:
> 1. I have no idea which MP3 decoder the browser uses to decode the
>    MP3s, and hence the duration of the audio file will potentially
>    have some unknown alterations.
> 2. The time stamps of the same events, reported by listening to the
>    stimuli through the browser, will likely always have some
>    (different) offset.
>
> Hence the reaction time, estimated as the difference between the two
> timestamps, will always be overestimated.
> How much noise (in ms) in a reaction-time measurement for sound
> stimuli can be considered insignificant? Any suggestions on this will
> help in deciding whether correcting the offsets is worthwhile.
>
> Best regards,
> Neeks
>
> On Wed, Nov 15, 2017 at 5:15 AM, Julien Bloit
> <julien.bloit@xxxxxxxx> wrote:
>> Hi,
>>
>> Zero-padding is applied for filtering purposes, see a (rather old)
>> explanation here:
>> http://lame.sourceforge.net/tech-FAQ.txt
>>
>> A command line tool like "afinfo" will be able to tell you how many
>> valid audio frames are in the mp3, and which are the priming and
>> remainder frames.
>>
>> Julien
>>
>> On Wed, Nov 15, 2017 at 8:57 AM, Windau, G.R.W. (Günter)
>> <G.Windau@xxxxxxxx> wrote:
>>> Dear Neeks,
>>>
>>> Your wav audio files can have an arbitrary length, depending on the
>>> duration of the audio sample. The mp3 audio file, however, is a
>>> sequence of frames with a certain length in bytes, and thus also in
>>> duration. After going from wav to mp3 and back, you will see that
>>> the duration of your audio sample has changed. I guess there
>>> will be some zero padding or small conversion artifacts before and
>>> after the 'real' audio.
>>>
>>> This may have been designed this way to prevent the introduction
>>> of audible clicks at the beginning and at the end when playing an
>>> mp3 file.
>>>
>>> If you need the duration of your audio files to be maintained, mp3
>>> may not be what you want.
>>>
>>> Best wishes,
>>> Günter
>>>
>>>> On 15 Nov 2017, at 08:02, Neeraj Sharma <neerajww@xxxxxxxx> wrote:
>>>>
>>>> Dear Members,
>>>>
>>>> An audio file in WAV can be converted to MP3 using the following
>>>> two utilities in a unix terminal (both work, and there may be many
>>>> more):
>>>>
>>>> $ ffmpeg -i <input.wav> -codec:a libmp3lame -b:a 320k <output.mp3> </dev/null
>>>> $ lame -q0 -b128 <input.wav> <output.mp3>
>>>>
>>>> But the issue is that the duration of <output.mp3> is greater than
>>>> the duration of <input.wav>. This is also true of other utilities
>>>> I have tried, like sox. Can anyone give insight on:
>>>>
>>>> a. Why is the duration increasing? In the attached image below, the
>>>>    duration variation is plotted for 410 sound files. The increase
>>>>    in duration appears to be WAV file dependent (although it is
>>>>    within 140 ms in this case).
>>>>
>>>> b. Is there an option in the above utilities which can reduce this
>>>>    difference in duration? I haven't been able to figure this out.
>>>>
>>>> A similar issue has been reported by a few others.
>>>> Example: https://www.sweetwater.com/forums/showthread.php?42631
>>>>
>>>> Best regards,
>>>> Neeks
>>>>
>>>> <duration_var_wav_mp3.png>
>>>
>>> --
>>> ing. Günter Windau | Technical Support Group | Dept. Biophysics |
>>> Donders Institute for Brain, Cognition and Behaviour | Radboud
>>> University Nijmegen | Heyendaalseweg 135, NL-6525AJ Nijmegen |
>>> room 00.817 | E: G.Windau@xxxxxxxx | T: +31 24 3613356 |
>>> W: http://www.mbfys.ru.nl/~gunter
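
The padding discussed in this thread (a fixed amount at the start, a file-dependent amount at the end) can be approximated numerically. What follows is a minimal Python sketch under stated assumptions, not the LAME implementation: it assumes MPEG-1 Layer III frames of 1152 samples per channel, a leading encoder delay of 576 samples (the value commonly cited for LAME; other encoders may differ), and it ignores any additional flush padding and the decoder's own delay. The function names `padded_duration` and `shift_timestamps` are hypothetical and chosen only for illustration.

```python
import math

# Assumption: MPEG-1 Layer III (32/44.1/48 kHz) uses 1152 samples per frame.
SAMPLES_PER_FRAME = 1152


def padded_duration(n_samples, sample_rate, encoder_delay=576):
    """Return (duration_s, extra_s) under a simplified padding model.

    encoder_delay is an ASSUMED number of zero samples prepended by the
    encoder; the end is then padded up to a whole number of frames.
    """
    total = n_samples + encoder_delay
    n_frames = math.ceil(total / SAMPLES_PER_FRAME)
    padded = n_frames * SAMPLES_PER_FRAME
    return padded / sample_rate, (padded - n_samples) / sample_rate


def shift_timestamps(event_times_s, sample_rate, leading_delay=576):
    """Shift event times marked on the WAV by the assumed leading padding."""
    offset = leading_delay / sample_rate
    return [t + offset for t in event_times_s]


if __name__ == "__main__":
    # Two files of slightly different length get different extra durations.
    for n in (44100, 45100):
        duration, extra = padded_duration(n, 44100)
        print(f"{n} samples -> {duration:.4f} s ({extra * 1000:.1f} ms extra)")
    print(shift_timestamps([0.5, 1.0], 44100))
```

Under this model the extra duration depends on how the sample count falls relative to the frame boundary, which is consistent with the file-dependent increase reported above. Whether a given browser decoder trims the padding is not covered here, so any offset derived this way should be checked against measurements on the actual playback system.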


This message came from the mail archive
../postings/2017/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University