Re: [AUDITORY] Converting audio file from WAV to MP3 changes file duration. Why? (Neeraj Sharma)


Subject: Re: [AUDITORY] Converting audio file from WAV to MP3 changes file duration. Why?
From:    Neeraj Sharma  <neerajww@xxxxxxxx>
Date:    Wed, 15 Nov 2017 22:03:00 -0500
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Members,

Thank you for the suggestions and the useful links.
In particular, http://lame.sourceforge.net/tech-FAQ.txt states the reason for the increase in duration when going from WAV to MP3.
Loosely speaking, the increase is due to zero-padding at the start and the end. The padding at the start seems to be fixed (for the codec used), but the padding at the end depends on the input file duration (or number of samples).

Why this is bothering me:

I have sound stimuli created in WAV. I have created time stamps that map to certain significant "waveform events" in the signal.
I will be playing back these stimuli in HTML, and due to a certain requirement I have to use MP3s. The issue is that:
1. I have no idea which MP3 decoder the browser uses to decode the MP3s, and hence the duration of the audio file will potentially have some unknown alterations.
2. The time stamps of the same events, as reported by listening to the stimuli through the browser, will likely always have some (different) offset.

Hence the reaction time, estimated as the difference between the two time stamps, will always be overestimated.
How much noise (in msec) in a reaction time measurement for sound stimuli is insignificant? Any suggestions on this will help in deciding whether correcting the offsets is worthwhile.

Best regards,
Neeks

On Wed, Nov 15, 2017 at 5:15 AM, Julien Bloit <julien.bloit@xxxxxxxx> wrote:

> Hi,
>
> Zero-padding is applied for filtering purposes, see a (rather old)
> explanation here:
> http://lame.sourceforge.net/tech-FAQ.txt
>
> A command line tool like "afinfo" will be able to tell you how many valid
> audio frames are in the mp3, and which are the priming and remainder
> frames.
>
> Julien
>
> On Wed, Nov 15, 2017 at 8:57 AM, Windau, G.R.W. (Günter)
> <G.Windau@xxxxxxxx> wrote:
>
>> Dear Neeks,
>>
>> Your wav audio files can have an arbitrary length, depending on the
>> duration of the audio sample. The mp3 audio file, however, is a sequence
>> of frames with a certain length in bytes, and thus also in duration.
>> After going from wav to mp3 and back, you will see that the duration of
>> your audio sample has changed. I guess there will be some zero padding
>> or small conversion artifacts before and after the 'real' audio.
>>
>> This may have been designed this way to prevent the introduction of
>> audible clicks at the beginning and at the end when playing an mp3 file.
>>
>> If you need the duration of your audio files to be maintained, mp3 may
>> not be what you want.
>>
>> Best wishes,
>> Günter
>>
>> On 15 Nov 2017, at 08:02, Neeraj Sharma <neerajww@xxxxxxxx> wrote:
>>
>>> Dear Members,
>>>
>>> An audio file in WAV can be converted to MP3 using the following two
>>> utilities in a unix terminal (both work, and there may be many more):
>>>
>>> $ ffmpeg -i <input.wav> -codec:a libmp3lame -b:a 320k <output.mp3> </dev/null
>>> $ lame -q0 -b128 <input.wav> <output.mp3>
>>>
>>> But the issue is that the duration of <output.mp3> is longer than the
>>> duration of <input.wav>. This is also true of other utilities which I
>>> have tried, like sox. Can anyone give insight on:
>>>
>>> a. Why is the duration increasing? In the attached image below, the
>>> duration variation is plotted for 410 sound files. The increase in
>>> duration appears to be WAV file dependent (although it is within 140 ms
>>> in this case).
>>>
>>> b. Is there an option in the above utilities which can reduce this
>>> difference in duration? I haven't been able to figure this out.
>>>
>>> A similar issue has been reported by a few others also.
>>> Example: https://www.sweetwater.com/forums/showthread.php?42631
>>>
>>> Best regards,
>>> Neeks
>>>
>>> <duration_var_wav_mp3.png>
>>
>> --
>> ing. Günter Windau | Technical Support Group | Dept. Biophysics |
>> Donders Institute for Brain, Cognition and Behaviour | Radboud
>> University Nijmegen | Heyendaalseweg 135, NL-6525AJ Nijmegen |
>> room 00.817 | E: G.Windau@xxxxxxxx | T: +31 24 3613356 |
>> W: http://www.mbfys.ru.nl/~gunter
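
The padding arithmetic discussed in this thread can be sketched in a few lines of Python. This is only a rough model, not what any particular encoder or browser decoder actually does: it assumes the stream is padded to a whole number of 1152-sample frames (the MPEG-1 Layer III frame size) and that the leading padding is a fixed number of samples. The value 1105 used in the example is an illustrative placeholder for the combined encoder and decoder delay, not a measured figure, and the function names are hypothetical. The real leading offset should be measured for the encoder and player in use.

```python
import math

# MPEG-1 Layer III stores 1152 samples per frame (32, 44.1 and 48 kHz audio).
SAMPLES_PER_FRAME = 1152


def padded_length(n_samples, start_padding, samples_per_frame=SAMPLES_PER_FRAME):
    """Return (total_samples, end_padding) for a simplified padding model.

    Assumption: the signal is preceded by `start_padding` zero samples and
    then padded at the end up to the next whole frame. Real encoders may
    add more than this.
    """
    frames = math.ceil((n_samples + start_padding) / samples_per_frame)
    total = frames * samples_per_frame
    return total, total - start_padding - n_samples


def shift_timestamps(timestamps_s, start_padding, sample_rate):
    """Shift event times (in seconds) by the leading padding."""
    offset_s = start_padding / sample_rate
    return [t + offset_s for t in timestamps_s]


if __name__ == "__main__":
    ASSUMED_START_PADDING = 1105  # placeholder value, measure your own
    total, end_padding = padded_length(44100, ASSUMED_START_PADDING)
    print("total samples:", total, "end padding:", end_padding)
    print(shift_timestamps([0.5, 1.25], ASSUMED_START_PADDING, 44100))
```

Under this model the leading offset is the same for every file, so it could be removed from the time stamps with a single constant, while the trailing padding varies from file to file but falls after the last event and does not move any time stamp.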


This message came from the mail archive
../postings/2017/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University