Subject: Re: 1. MP3 or AAC mixing in compressed/coded domain (2)
From: Christian Borss <christian.borss@xxxxxxxx>
Date: Tue, 3 Mar 2009 09:39:38 +0100
List-Archive: <http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

On 2 Mar 2009 at 10:46, Lorenzo Picinali wrote:
> There is plenty of audio editing and mixing software that can open MP3 or
> AAC (AC3) files, but they all convert the compressed files to an
> uncompressed PCM format, perform all the mixing and editing operations,
> and then convert the result back to the compressed format. This is for
> sure a good method for short files, but a ten-hour stereo MP3 recording,
> for example, would be nearly 10 GB of data in a normal PCM format, so
> mixing and editing directly in MP3 could be a very good idea in this case.

Memory is not a constraint. Just use a stream-based audio processing
tool such as sox, which decodes only as much as is needed for encoding.
Here's a small example: the first two commands each encode 4 GB of raw
PCM data into a differently encoded audio file, and the third command
generates a mixture without blowing up your memory:

chris@xxxxxxxx:/data$ dd if=/dev/zero bs=1M count=4096 | sox -t raw -c 2 -r 44100 -s -2 - foo.mp3
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 512.311 s, 8.4 MB/s

chris@xxxxxxxx:/data$ dd if=/dev/zero bs=1M count=4096 | sox -t raw -c 2 -r 44100 -s -2 - bar.ogg
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 1057.75 s, 4.1 MB/s

chris@xxxxxxxx:/data$ sox -m foo.mp3 bar.ogg foobar.mp3

chris@xxxxxxxx:/data$ sox --version
sox: SoX v14.2.0

> I'm not aware of any available software that directly mixes compressed
> files, but for editing you could use http://mpesch3.de1.cc/ , which
> allows you to cut MP3 files and change the volume...

Cutting MP3 files by leaving out or inserting whole MP3 frames is a
trivial problem.
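To illustrate why cutting at frame boundaries is simple, here is a
minimal Python sketch, not a tested tool. It assumes a bare MPEG-1
Layer III stream (no ID3 tags, no free-format frames) whose frames are
self-contained, i.e. encoded without relying on data from neighbouring
frames. The function names frame_length, split_frames and cut are made
up for this example.

```python
# Sketch only: cut an MPEG-1 Layer III stream at whole-frame boundaries.
# Assumptions: no ID3 tags, no free-format frames, self-contained frames.

# Bitrate (kbit/s) and sample rate (Hz) tables for MPEG-1 Layer III,
# indexed by the corresponding header fields. None marks unsupported values.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96,
                 112, 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]


def frame_length(header: bytes) -> int:
    """Return the length in bytes of the frame that starts with this header."""
    if len(header) < 4 or header[0] != 0xFF or (header[1] & 0xE0) != 0xE0:
        raise ValueError("no frame sync found")
    version = (header[1] >> 3) & 0x03   # 0b11 means MPEG-1
    layer = (header[1] >> 1) & 0x03     # 0b01 means Layer III
    if version != 0b11 or layer != 0b01:
        raise ValueError("not an MPEG-1 Layer III frame")
    bitrate = BITRATES_KBPS[header[2] >> 4]
    sample_rate = SAMPLE_RATES_HZ[(header[2] >> 2) & 0x03]
    if bitrate is None or sample_rate is None:
        raise ValueError("unsupported bitrate or sample rate")
    padding = (header[2] >> 1) & 0x01
    # 1152 samples per frame -> 144 * bitrate / sample_rate bytes (+ padding)
    return 144 * bitrate * 1000 // sample_rate + padding


def split_frames(data: bytes) -> list:
    """Split a byte string into complete frames; a truncated tail is dropped."""
    frames, pos = [], 0
    while pos + 4 <= len(data):
        length = frame_length(data[pos:pos + 4])
        if pos + length > len(data):
            break
        frames.append(data[pos:pos + length])
        pos += length
    return frames


def cut(data: bytes, first: int, last: int) -> bytes:
    """Keep frames first..last-1 and drop everything else."""
    return b"".join(split_frames(data)[first:last])
```

For example, cut(data, 100, 200) would keep frames 100 to 199. At
44.1 kHz each frame covers 1152 samples, so the cut resolution is
about 26 ms.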
One of the reasons why we use MP3 for real-time audio streaming in our
Internet-based interactive auditory virtual environment [1] is that the
transmitted audio frames are self-contained. So, if single frames are
lost, you can continue decoding the stream with the next received audio
frame. The same principle can be used for cutting MP3 files. Changing
the volume can also be done easily by scaling the quantized frequency
bins in the compressed domain.

Mixing MP3 and AAC in the encoded domain is anything but trivial. MP3
and AAC use different transforms and different transform lengths. That
means you have different representations of differently overlapping
time segments. How do you want to mix that?!? If you want to mess around
with these problems, you need good reasons for doing so. In addition,
the re-encoding step commonly used in audio editing software ensures
that the masking effect is properly taken into account for the mixture.

Ciao,
Christian

[1] C. Borß, A. Silzle, and R. Martin. Internet-Based Interactive
    Auditory Virtual Environment Generators. In 14th Int. Conf. on
    Auditory Display, Paris, France, Jun. 2008.

-- 
Christian Borß, Dipl.-Ing.          || Institut für Kommunikationsakustik
http://www.ika.ruhr-uni-bochum.de   || Ruhr-Universität Bochum
Tel.: +49-(0)234-32-22470           || Universitätsstr. 150, IC1/33
Fax.: +49-(0)234-32-14165           || D-44780 Bochum (Germany)