
Re: [AUDITORY] Converting audio file from WAV to MP3 changes file duration. Why?



I would be *very* wary of any audio/visual/interface timing using such a
protocol. For example, different sound card hardware configurations will
have different playback buffer lengths and latencies, so there can be a
delay of 100 ms or more between sending the play command and the sound
actually being played. You cannot control this from the browser, and it is
doubtful you can even access this driver information. I would seriously
avoid such a protocol for anything where timing is important and where you
want to compare results between individuals. A common hardware setup is
usually required for this type of test. Worse yet, under some OSes the
playback delay can vary during an experiment: if there is sufficient silence
preceding a sound, the sound card can "go to sleep", meaning there is an
additional wake-up delay or loss of the first several buffers. This is
something I have clearly observed under Windows, where we needed to play a
"silent" track continuously to avoid...
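
A minimal sketch of how such a "silent" keep-alive track could be generated,
assuming an all-zero 16-bit PCM WAV file is acceptable. The function name
write_silence and its default parameters are hypothetical, and whether
all-zero samples are enough to keep a given sound card awake is an assumption
that would need checking on the actual hardware:

```python
import wave


def write_silence(path: str, seconds: float,
                  sample_rate: int = 44100, channels: int = 2) -> int:
    """Write a 16-bit PCM WAV file of all-zero samples; return the frame count."""
    n_frames = int(round(seconds * sample_rate))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)            # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        # bytes(n) yields n zero bytes, i.e. digital silence on every channel
        wav.writeframes(bytes(n_frames * channels * 2))
    return n_frames
```

The resulting file would then be looped by whatever player the experiment
uses, in parallel with the stimuli.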

As others have said, any use of lossy compression (e.g. MPEG) should be
avoided in research unless you are actually studying the effects of
compression. In the case of multichannel audio, studies have shown
detrimental effects of compression on localization cues as a function of
codec and bitrate. These include interaural time delays between channels,
meaning that inter-channel *timing* cues are altered in some compression
methodologies.

The issue of the audio file length varying (the MP3 length needs to be a
multiple of the encoder block size, while WAV doesn't have this limitation)
will have negligible influence (unless your timing is based on a loop
playback counter) compared to these other, more important points.
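
As a rough illustration of the block-size effect, here is a sketch that
rounds a sample count up to a whole number of frames. It assumes MPEG-1
Layer III, where each frame carries 1152 samples per channel. The
leading_pad parameter is a hypothetical stand-in for whatever fixed padding
a given encoder adds at the start, and real encoders may add further frames
at the end, so this should be read as a lower bound and not as the behaviour
of any specific codec:

```python
FRAME_SAMPLES = 1152  # samples per channel in one MPEG-1 Layer III frame


def padded_length(n_samples: int, leading_pad: int = 0) -> int:
    """Round (leading_pad + n_samples) up to a whole number of frames."""
    total = leading_pad + n_samples
    n_frames = -(-total // FRAME_SAMPLES)  # integer ceiling division
    return n_frames * FRAME_SAMPLES


def added_ms(n_samples: int, sample_rate: int, leading_pad: int = 0) -> float:
    """Extra duration in milliseconds introduced by the padding."""
    extra = padded_length(n_samples, leading_pad) - n_samples
    return 1000.0 * extra / sample_rate
```

Under these assumptions, a one-second file at 44.1 kHz (44100 samples) is
rounded up to 39 frames, which adds a little under 19 ms.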

I would think you need to do a good series of protocol evaluations on
different systems to see what the range of timing errors can be across
browsers, OSes (including smartphones), current CPU load, audio hardware,
etc. It may be that your timing errors exceed the scale of the phenomenon
you are trying to study.
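
One way such an evaluation could be summarized, as a sketch only: given
play-command-to-onset latencies measured on each configuration (for example
with a loopback recording), compare the worst-case timing error with the
size of the effect under study. The function names, the dictionary layout,
and the choice of "spread of means plus largest peak-to-peak jitter" as the
error measure are all assumptions made for illustration:

```python
from statistics import mean


def summarize(latencies_ms):
    """Mean and peak-to-peak jitter of one configuration's onset latencies."""
    return {
        "mean": mean(latencies_ms),
        "jitter": max(latencies_ms) - min(latencies_ms),
    }


def worst_case_error(by_config):
    """Spread of mean latency across configurations plus the largest jitter."""
    stats = [summarize(values) for values in by_config.values()]
    means = [s["mean"] for s in stats]
    spread = max(means) - min(means)
    return spread + max(s["jitter"] for s in stats)


def timing_is_adequate(by_config, effect_ms):
    """True only if the worst-case timing error is smaller than the effect."""
    return worst_case_error(by_config) < effect_ms
```

Here by_config maps a label for each browser/OS/hardware combination to its
list of measured latencies in milliseconds.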

At least, those are my thoughts.

--
Brian FG Katz, Ph.D, HDR
Research Director, CNRS
Institut d'Alembert , Sorbonne Universités, UPMC Univ Paris 06, CNRS bureau
510, 5ème, allé des tours 55-65 4, Place Jussieu
75252 Paris cedex 05

Tel: (+33) 01 44 27 80 34
web_perso: http://www.dalembert.upmc.fr/home/katz
web_lab: http://www.dalembert.upmc.fr
web_group: http://www.dalembert.upmc.fr/lam      

-----Original Message-----
From: AUDITORY - Research in Auditory Perception
[mailto:AUDITORY@xxxxxxxxxxxxxxx] On Behalf Of Bob Masta
Sent: Thursday, November 16, 2017 2:26 PM
To: AUDITORY@xxxxxxxxxxxxxxx
Subject: Re: [AUDITORY] Converting audio file from WAV to MP3 changes file
duration. Why?

Don't most browsers allow playing of .WAV files directly? 

That would solve your immediate problem, and also eliminate a much more
insidious one that could come back to bite you down the road:  Namely, that
MP3 uses perceptual coding.  It's essentially based on "tricking" the
auditory system by omitting things it doesn't think most people will miss.
It's a compromise between accuracy and file size.  It was intended for
"consumer" applications, not basic research.

MP3 might be good enough for whatever you are doing, but in general it seems
risky for auditory experiments that are trying to determine what people can
perceive.  

Best regards,

Bob Masta

===================


On 15 Nov 2017 at 22:03, Neeraj Sharma wrote:

> 
> Members,
> 
> Thank you for the suggestions and the useful links. 
> In particular, http://lame.sourceforge.net/tech-FAQ.txt states the
> reason for the increase in duration from WAV to MP3.
> Loosely speaking, the increase is due to zero-padding at the start and end.
> The zero-padding at the start seems to be fixed (for the codec used),
> but that at the end will depend on the input file duration (or number of
> samples).
> 
> Why this is bothering me:
> 
> I have sound stimuli created in WAV. I have created time stamps that
> map to certain significant "waveform events" in the signal.
> I will be playing back these stimuli in HTML, and due to a certain
> requirement I have to use MP3s.
> The issue is that:
> 1. I have no idea which MP3 decoder the browser uses to decode the
> MP3s, and hence the duration of the audio file will potentially have
> some unknown alterations.
> 2. The time stamps of the same events, reported by listening to the
> stimuli through the browser, will likely always have some (different) offset.
> 
> Hence, reaction time estimated as the difference between the two time
> stamps will always be overestimated.
> How much noise (in msec) in reaction time measurement for sound
> stimuli is insignificant? Any suggestions on this will help in
> deciding whether correcting the offsets is worthwhile.
> 
> Best regards,
> Neeks
> 
> On Wed, Nov 15, 2017 at 5:15 AM, Julien Bloit <julien.bloit@xxxxxxxxx> wrote:
>     Hi,
> 
>     Zero-padding is applied for filtering purposes, see a 
>     (rather old) explanation here: 
>     http://lame.sourceforge.net/tech-FAQ.txt
> 
>     A command line tool like "afinfo" will be able to tell you 
>     how many valid audio frames are in the mp3, and which 
>     are the priming and remainder frames.
> 
>     Julien
> 
>     On Wed, Nov 15, 2017 at 8:57 AM, Windau, G.R.W. 
>     (Günter) <G.Windau@xxxxxxxxxxxxx> wrote:
>     Dear Neeks,
> 
>     Your wav audio files can have an arbitrary length,
>     depending on the duration of the audio sample. The 
>     mp3 audio file however, is a sequence of frames with 
>     a certain length in bytes, and thus also in duration. 
>     After going from wav to mp3 and back, you will see 
>     that the duration of your audio sample has
>     changed. I guess there will be some zero padding or 
>     small conversion artifacts before and after the 'real' 
>     audio.
> 
>     This may have been designed this way to prevent the 
>     introduction of audible clicks at the beginning and at 
>     the end when playing an mp3 file.
> 
>     If you need the duration of your audio files to be 
>     maintained, mp3 may not be what you want.
> 
>     Best wishes,
>     Günter
> 
> 
>     On 15 Nov 2017, at 08:02, Neeraj Sharma 
>     <neerajww@xxxxxxxxx> wrote:
> 
>     Dear Members,
>     
>     An audio file in WAV can be converted to MP3
>     using the following two utilities in a Unix terminal (both
>     work, and there may be many more):
>     
>     $ ffmpeg -i <input.wav> -codec:a libmp3lame -b:a 320k <output.mp3> </dev/null
>     $ lame -q0 -b128 <input.wav> <output.mp3>
>     
>     But the issue is that the duration of <output.mp3>
>     is longer than the duration of <input.wav>. This is also true
>     of other utilities I have tried, like sox. Can
>     anyone give insight on:
> 
>     a. Why is the duration increasing? In the attached
>     image below, the duration variation is plotted for
>     410 sound files. The increase in duration appears
>     to be WAV file dependent (although it is within
>     140 ms in this case).
> 
>     b. Is there an option in the above utilities which can
>     reduce this difference in duration? I haven't been
>     able to figure this out.
>     
>     A similar issue has been reported by a few others.
>     Example:
>     https://www.sweetwater.com/forums/showthread.php?42631
>     
>     Best regards,
>     Neeks
> 
> 
>     <duration_var_wav_mp3.png>
> 
>     -
>     ing. Günter Windau | Technical Support Group | Dept. Biophysics |
>     Donders Institute for Brain, Cognition and Behaviour |
>     Radboud University Nijmegen | Heyendaalseweg 135, NL-6525AJ Nijmegen |
>     room 00.817 | E: G.Windau@xxxxxxxxxxxxx | T: +31 24 3613356 |
>     W: http://www.mbfys.ru.nl/~gunter
> 
> 
>