Re: [AUDITORY] Converting audio file from WAV to MP3 changes file duration. Why?

Hello Neeks,

If you want to stay with mp3 despite the last two comments, I have understood from your email that the zero padding at the start is fixed for the same codec. Therefore your timestamp should be fine, shouldn't it? You will always have the same starting offset across all samples.
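
A minimal sketch of that correction, assuming the start offset really is constant for a given encoder/decoder pair. The function names and the offset of 1105 samples are illustrative assumptions, not values taken from this thread; the actual offset should be measured for the codec in use.

```python
def shift_event_times(event_times_s, offset_samples, sample_rate):
    """Shift event times (in seconds, marked on the WAV) by a constant codec offset."""
    offset_s = offset_samples / sample_rate
    return [t + offset_s for t in event_times_s]


def reaction_time(response_s, playback_start_s, event_s):
    """Reaction time = response time minus the time the event was actually played."""
    return response_s - (playback_start_s + event_s)


# Illustrative numbers only: 1105 samples at 44.1 kHz is roughly 25 ms.
wav_events = [0.5, 1.25]
mp3_events = shift_event_times(wav_events, offset_samples=1105, sample_rate=44100)
```

If the offset is the same for every stimulus, it only shifts all reaction times by a constant and does not add trial-to-trial noise.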

Apart from that, I recently ran a reaction time test and stumbled over a few issues:

- to start with, I used keyboard presses as input to MaxMSP and found that there was a buffering delay in the range of 20 to 70 ms, depending on the settings (as mentioned in the previous email). Yet, with enough data points collected, I ended up with an RT distribution and results similar to those I got afterwards, having corrected for the buffering. I also read somewhere that special keys such as Alt and Ctrl can be handled in real time by the computer if the software requests it.

- the analysis of the data is not trivial, and I would recommend starting with the paper "To transform or not to transform: using generalized linear mixed models to analyse reaction time data" by Lo and Andrews.

- from the analysis of my results, differences in the modes of the RT distributions of as little as 10 ms may reach significance if there are enough data points. These differences are a lot smaller than any inter-participant differences, or differences you may encounter due to variations in response speed between e.g. the left and right hand. Hence the rather complex statistical analysis outlined in the paper above.

Good luck with the tests,

Hanne

If you're playing in a browser I would also worry about an unknown amount of latency caused by the browser and the system audio (probably on the order of 50-100ms). Are you in control of the computer used for the experiment or are you distributing the experiment over the web to be run on subjects' computers (or phones)?

-s

On Wed, Nov 15, 2017, at 10:03 PM, Neeraj Sharma wrote:

Members,

Thank you for the suggestions and the useful links.

In particular, http://lame.sourceforge.net/tech-FAQ.txt explains the reason for the increase in duration when going from WAV to MP3.

Loosely speaking, the increase is due to zero-padding at the start and end. The zero-padding at the start seems to be fixed (for the codec used), but that at the end depends on the input file duration (or number of samples).
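
As a rough sketch of why the end padding depends on the input length: MPEG-1 Layer III stores audio in frames of 1152 samples, so the padded signal has to fill a whole number of frames. The function name and the default encoder delay of 576 samples are assumptions for illustration; real encoders may add further frames, so this gives a lower bound rather than the exact duration.

```python
import math

SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III frame size in samples


def padded_duration_s(n_samples, sample_rate, encoder_delay=576):
    """Estimate MP3 duration: fixed start padding, then round up to whole frames.

    encoder_delay is an assumed default; check it for the encoder actually used.
    """
    total = n_samples + encoder_delay
    n_frames = math.ceil(total / SAMPLES_PER_FRAME)
    return n_frames * SAMPLES_PER_FRAME / sample_rate


# A 1 s file at 44.1 kHz needs 39 frames under these assumptions.
estimate = padded_duration_s(44100, 44100)
```

Under these assumptions the start padding is constant, while the end padding varies between 0 and 1151 samples depending on the number of input samples.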

The reason this bothers me is:

I have sound stimuli created in WAV. I have created time stamps that map to certain significant "waveform events" in the signal.

I will be playing back these stimuli in HTML, and due to a certain requirement I have to use MP3s. The issue is that:

1. I have no idea which MP3 decoder the browser uses to decode the MP3s, and hence the duration of the audio file will potentially have some unknown alterations.

2. The time stamps of the same events, as reported by subjects listening to the stimuli through the browser, will likely always have some (different) offset.

Hence, the reaction time (estimated as the difference between the two time stamps) will always be overestimated.

How much noise (in ms) in a reaction time measurement for sound stimuli is insignificant? Any suggestions on this will help in deciding whether correcting the offsets is worthwhile.

Best regards,

Neeks

On Wed, Nov 15, 2017 at 5:15 AM, Julien Bloit <julien.bloit@xxxxxxxxx> wrote:

Hi,

Zero-padding is applied for filtering purposes, see a (rather old) explanation here:

http://lame.sourceforge.net/tech-FAQ.txt

A command line tool like "afinfo" can tell you how many valid audio frames are in the MP3, and how many priming and remainder frames there are.

Julien

On Wed, Nov 15, 2017 at 8:57 AM, Windau, G.R.W. (Günter) <G.Windau@xxxxxxxxxxxxx> wrote:

Dear Neeks,

Your WAV audio files can have an arbitrary length, depending on the duration of the audio sample. The MP3 audio file, however, is a sequence of frames with a certain length in bytes, and thus also in duration. After going from WAV to MP3 and back, you will see that the duration of your audio sample has changed. I guess there will be some zero padding or small conversion artifacts before and after the 'real' audio.

This may have been designed this way to prevent the introduction of audible clicks at the beginning and at the end when playing an mp3 file.

If you need the duration of your audio files to be maintained, mp3 may not be what you want.

Best wishes,

Günter

On 15 Nov 2017, at 08:02, Neeraj Sharma <neerajww@xxxxxxxxx> wrote:

Dear Members,

An audio file in WAV can be converted to MP3 using the following two utilities in a Unix terminal (both work, and there may be many more):

$ ffmpeg -i <input.wav> -codec:a libmp3lame -b:a 320k <output.mp3> </dev/null

$ lame -q0 -b128 <input.wav> <output.mp3>

But the issue is that the duration of <output.mp3> is greater than the duration of <input.wav>. This is also true of other utilities I have tried, such as sox. Can anyone give insight on:

a. Why is the duration increasing? In the attached image below, the duration difference is plotted for 410 sound files. The increase in duration appears to be WAV-file dependent (although it is within 140 ms in this case).

b. Is there an option in the above utilities that can reduce this difference in duration? I haven't been able to figure this out.

A similar issue has been reported by a few others as well.
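
One way to quantify the difference in (a) is to decode each MP3 back to WAV with the decoder of interest and compare sample counts. The sketch below uses only Python's standard `wave` module; the function names are hypothetical, and the decoded WAV file is assumed to have been produced beforehand.

```python
import wave


def wav_duration_s(path):
    """Duration of a WAV file in seconds, from its frame count and sample rate."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def duration_increase_ms(original_wav, decoded_wav):
    """Extra duration (ms) of the decoded file relative to the original."""
    return 1000.0 * (wav_duration_s(decoded_wav) - wav_duration_s(original_wav))
```

Running `duration_increase_ms` over all stimulus pairs would show whether the increase varies with the input length or stays constant.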

Example: https://www.sweetwater.com/forums/showthread.php?42631

Best regards,

Neeks

<duration_var_wav_mp3.png>

—

ing. Günter Windau | Technical Support Group | Dept. Biophysics | Donders Institute for Brain, Cognition and Behaviour | Radboud University Nijmegen | Heyendaalseweg 135, NL-6525AJ Nijmegen | room 00.817 | E: G.Windau@donders.ru.nl | T: +31 24 3613356 | W: http://www.mbfys.ru.nl/~gunter