
Re: Audio Time and/or Level Alignment Algorithm

To: AUDITORY@xxxxxxxxxxxxxxx
Subject: Re: Audio Time and/or Level Alignment Algorithm
From: Dan Stowell <dan.stowell@xxxxxxxxxxxxxxx>
Date: Thu, 3 Jul 2008 12:36:20 +0100

Hi -

I don't have any source code for you, I'm afraid. But if you take the
envelopes of the two signals, then for each of your candidate
time-delays you have a set of 2D data points (use the same number of
points for each time-delay). There are various ways to estimate the
mutual information from that, then you just look for the time-delay
giving the largest mutual information.
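As a rough illustration of that search (a sketch only, not tested on real audio): the function below scans candidate delays and scores each one on the same number of envelope samples. The names `best_delay`, `max_delay` and `score` are made up for this example; `score` is any dependence measure you choose to pass in, such as a mutual information estimate.

```python
def best_delay(ref_env, test_env, max_delay, score):
    """Return (delay, value) maximising score(ref_window, test_window).

    A positive delay means test_env lags ref_env by that many samples.
    Every candidate delay is scored on the same number of points, n.
    """
    n = min(len(ref_env), len(test_env)) - 2 * max_delay
    if n <= 0:
        raise ValueError("signals too short for this max_delay")

    # Fixed reference window; the test window slides around it.
    ref_window = ref_env[max_delay:max_delay + n]

    best = None
    for delay in range(-max_delay, max_delay + 1):
        start = max_delay + delay
        value = score(ref_window, test_env[start:start + n])
        if best is None or value > best[1]:
            best = (delay, value)
    return best
```

Trimming `max_delay` samples from each end of the reference is one simple way to keep the number of 2D points identical for every candidate delay, at the cost of ignoring the edges of the signals.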

The quick and dirty way to get a MI value is to calculate a 2D histogram
and the two corresponding 1D histograms (representing the marginals of
the 2D histogram) and calculate the mutual information from the
entropies of those 3 distributions (I(X;Y) = H(X) + H(Y) - H(X,Y)). This
is a rough approach but easy and quick. There are various tweaks you
could add such as adaptive bin widths etc.
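A minimal sketch of that histogram estimate, assuming equal-width bins spanning each signal's range. The function name `mutual_information` and the default of 16 bins are choices made for this example, not part of any library, and the estimate is biased for small sample counts.

```python
import math
from collections import Counter

def mutual_information(xs, ys, bins=16):
    """Histogram estimate of I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and the same length")

    def bin_index(v, lo, hi):
        # Equal-width bins over [lo, hi]; the maximum goes in the last bin.
        if hi == lo:
            return 0
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)

    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    pairs = [(bin_index(x, x_lo, x_hi), bin_index(y, y_lo, y_hi))
             for x, y in zip(xs, ys)]

    def entropy(counts):
        total = sum(counts.values())
        return -sum(c / total * math.log2(c / total) for c in counts.values())

    joint = Counter(pairs)                 # 2D histogram
    marg_x = Counter(i for i, _ in pairs)  # 1D marginal for X
    marg_y = Counter(j for _, j in pairs)  # 1D marginal for Y
    return entropy(marg_x) + entropy(marg_y) - entropy(joint)
```

The marginals are derived from the same binned pairs as the joint histogram, so the three entropies are consistent with each other.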


There are more accurate approaches. One is:

Estimation of the information by an adaptive partitioning of the observation space

Darbellay and Vajda (1999)
http://dx.doi.org/10.1109/18.761290

I'm not much of a vision person but I believe MI is used for image
alignment fairly often - a very similar task. It's discussed in this
paper (which also describes another approach to the estimation, using
Nearest-Neighbours):

High-Dimensional Entropy Estimation for Finite Accuracy Data: R-NN Entropy Estimator

Kybic (2007)
http://dx.doi.org/10.1007/978-3-540-73273-0_47


HTH
Dan


Dermot Campbell wrote:

Hi Dan,

I'm also interested in this.

Do you have any more information on what you proposed or even source code?

Thanks,
Dermot.


------------------------------------------------
Dermot Martin Campbell,
Postgraduate Research Student,
Dept. of Electronic Eng.,
National University of Ireland,
Galway City.
Tel:(091) 493031
Email: Dermot.Campbell@xxxxxxxxxxxx

Hi -

You might like to consider using mutual information as an alternative to
cross-correlation. The advantage is that cross-correlation is always
about the linear dependencies between the signals, whereas the mutual
information can also highlight nonlinear dependencies (for example, in
your case, the codec may have added compression).

Dan



Junyong You wrote:

Hi John and all,

In fact, I am also looking into the time alignment of two samples: one is
the original and the other is decoded. My problem is to estimate the time
delay caused by audio coding.

I tried a classical estimation method based on the envelope
cross-correlation function. That is, the envelopes of the two samples are
computed first, then the cross-correlation function of these two
envelopes is calculated, and the lag corresponding to the maximal
correlation is taken as the delay.
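The envelope cross-correlation method described above might be sketched as follows. This is an illustration under assumptions, not the code used in the thread: the envelope is a crude moving average of the rectified signal, the correlation is normalised per lag so that level differences do not affect the peak, and the names `envelope`, `pearson`, `xcorr_delay`, `window` and `max_delay` are all made up for the example.

```python
import math

def envelope(signal, window=32):
    """Crude envelope: causal moving average of the rectified signal."""
    rectified = [abs(s) for s in signal]
    out, acc = [], 0.0
    for i, r in enumerate(rectified):
        acc += r
        if i >= window:
            acc -= rectified[i - window]
        out.append(acc / window)
    return out

def pearson(a, b):
    """Correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def xcorr_delay(ref, test, max_delay, window=32):
    """Lag in samples (positive: test lags ref) with the highest
    envelope correlation, searched over -max_delay..max_delay."""
    ref_env, test_env = envelope(ref, window), envelope(test, window)
    n = min(len(ref_env), len(test_env)) - 2 * max_delay
    if n <= 0:
        raise ValueError("signals too short for this max_delay")
    ref_win = ref_env[max_delay:max_delay + n]
    scores = {
        lag: pearson(ref_win, test_env[max_delay + lag:max_delay + lag + n])
        for lag in range(-max_delay, max_delay + 1)
    }
    return max(scores, key=scores.get)
```

For real audio, a Hilbert-transform or low-pass-filtered envelope and an FFT-based correlation would be more usual; the plain loops here are only meant to make each step visible.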

I hope this method will help you, and if anyone has better approaches,
please let me know, thank you very much.

BR,  Junyong You

TUT, Finland
  ----- Original Message -----
  From: John Spencer
  To: AUDITORY@xxxxxxxxxxxxxxx
  Sent: Wednesday, July 02, 2008 2:11 PM
  Subject: [AUDITORY] Audio Time and/or Level Alignment Algorithm



  Hello List,



  I am looking for advice and help with a problem.



  I have 2 audio signals, each recorded in a different environment, but
both are the same length. I need to align them as well as I can, or align
at least one of them to match the other. They need to be aligned
time-wise and level-wise if possible. Any advice appreciated, thanks.



  John Spencer







--
Dan Stowell
Centre for Digital Music
Queen Mary, University of London
Mile End Road, London E1 4NS
http://www.elec.qmul.ac.uk/department/staff/research/dans.htm
http://www.mcld.co.uk/



