Re: 1. MP3 or AAC mixing in compressed/coded domain (2)

To: AUDITORY@xxxxxxxxxxxxxxx
Subject: Re: 1. MP3 or AAC mixing in compressed/coded domain (2)
From: Dan Ellis <dpwe@xxxxxxxxxxxxxxx>
Date: Tue, 3 Mar 2009 07:36:17 -0500

> I am asking about mixing directly in compressed format.
>
> some reasons to pursue that:
>
> 1. each time you uncompress edit and re-compress, you lose audio quality.

You lose audio quality because the original signal has been quantized
in its MP3 representation, which is equivalent to adding some random
offsets to each sample.  If you decode and re-encode, you potentially
re-quantize to a different quantization level, adding still more
random offsets.  If you stay in the quantized domain, you can avoid
that.  But if you are actually changing the waveform (by adding
signals together), you cannot avoid changing the quantization levels,
so I think the quality loss is unavoidable - the damage was actually
done when the original signal was quantized to MP3, and you can't undo
that.
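
To make the requantization point concrete, here is a minimal Python sketch. It assumes a plain uniform quantizer with made-up step sizes (0.1 and 0.07), which is a simplification: MP3's real quantizer is nonuniform and its step sizes vary per band and per frame. The names `quantize` and `rms` are just for illustration.

```python
import math
import random

def quantize(x, step):
    """Uniform quantizer (a simplification): snap x to the nearest multiple of step."""
    return step * round(x / step)

def rms(errors):
    """Root-mean-square of a list of error values."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

random.seed(0)
signal = [random.uniform(-1.0, 1.0) for _ in range(10000)]

# First encode: quantize with step 0.1 (hypothetical step size).
once = [quantize(x, 0.1) for x in signal]

# Re-quantize on the same grid: the values are already on it, so nothing changes.
same_grid = [quantize(q, 0.1) for q in once]

# Re-quantize on a different grid: a second layer of error is added.
new_grid = [quantize(q, 0.07) for q in once]

err_once = rms([q - x for q, x in zip(once, signal)])
err_new_grid = rms([q - x for q, x in zip(new_grid, signal)])

print("same grid unchanged:", same_grid == once)
print("RMS error after one quantization:", err_once)
print("RMS error after re-quantizing on a new grid:", err_new_grid)
```

Staying on the same grid costs nothing, but once the values move (as they do when two signals are added), they generally no longer sit on the old grid and have to be rounded again.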

> 2. it takes too long to re-encode the mp3 file after mixing in pcm format,
> which is unacceptable for some real-time (or close to real-time)
> application.

I'm not sure exactly how the compute time breaks down in an MP3
encoder, but the big thing that is slower in an encoder vs. a decoder
is that it has to recompute the psychoacoustic masking and bit
allocation.  Again, if you change the actual subband signals, and if
you wish to preserve good psychoacoustic masking, you will have to
re-run these stages.

Compressed-domain processing should allow you to avoid the frequency
transforms (initial polyphase filterbank and subsequent MDCT), since
the subband representation is still linear even after those initial
stages.  However, if you want to have the long/short MDCT window
switching properly done for your new signal, you will need to redo the
MDCT stage too.  So I think the polyphase filter is the only part you
can save without compromising encoding quality, which is probably less
than a quarter of the total processing.
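
The linearity claim can be checked with a small Python sketch. It uses a bare, unwindowed MDCT on 36-sample blocks as a stand-in for the real hybrid filterbank, so it is an assumption-laden toy rather than an MP3 analysis stage; the function name `mdct` and the random test signals are mine.

```python
import math
import random

def mdct(block):
    """Plain MDCT of 2N samples -> N coefficients (no window applied)."""
    n_half = len(block) // 2
    return [
        sum(
            x * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n, x in enumerate(block)
        )
        for k in range(n_half)
    ]

random.seed(1)
a = [random.uniform(-1, 1) for _ in range(36)]
b = [random.uniform(-1, 1) for _ in range(36)]

# Mix in the time domain, then transform.
mixed_then_transformed = mdct([x + y for x, y in zip(a, b)])

# Transform each signal, then mix the coefficients.
transformed_then_mixed = [p + q for p, q in zip(mdct(a), mdct(b))]

max_diff = max(
    abs(p - q) for p, q in zip(mixed_then_transformed, transformed_then_mixed)
)
print("largest difference between the two routes:", max_diff)
```

The two routes agree up to floating-point rounding, but only because both signals went through the same transform. If the two frames used different window lengths, the coefficients would not line up and could not simply be added, and the agreement also says nothing about the quantization that comes afterwards.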

If you're prepared to sacrifice audio quality, there is probably
something hacky you could do that would be much quicker, like choosing
each quantized subband from only one of the two signals depending on
which had greater energy in that band, which would give you a kind of
mixture.  But you'd still run into trouble if they had different
short/long MDCT windows in any particular frame.
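
A rough Python sketch of that hack, under heavy assumptions: each frame is represented as a hypothetical dict with a `block_type` string and a list of `bands`, where each band holds dequantized coefficient values so that energies are comparable. None of this mirrors a real MP3 bitstream layout; in an actual codec you would copy the chosen band's quantized values and scale factors unchanged.

```python
def band_energy(coeffs):
    """Sum of squared coefficients in one band."""
    return sum(c * c for c in coeffs)

def pick_louder_bands(frame_a, frame_b):
    """Per band, keep the coefficients of whichever frame has more energy."""
    if frame_a["block_type"] != frame_b["block_type"]:
        # Different short/long windows: the bands do not correspond.
        raise ValueError("frames use different MDCT window types")
    bands = [
        band_a if band_energy(band_a) >= band_energy(band_b) else band_b
        for band_a, band_b in zip(frame_a["bands"], frame_b["bands"])
    ]
    return {"block_type": frame_a["block_type"], "bands": bands}

# Two made-up frames with three bands each.
frame_a = {"block_type": "long", "bands": [[0.9, -0.8], [0.1, 0.0], [0.3, 0.2]]}
frame_b = {"block_type": "long", "bands": [[0.1, 0.1], [0.5, -0.4], [0.3, -0.3]]}

mixed = pick_louder_bands(frame_a, frame_b)
print(mixed["bands"])
```

Nothing is re-quantized here, which is where the speed would come from, but the weaker signal simply vanishes in every band it loses, and the mismatched-window case is refused rather than solved.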

  DAn.
