Postdoc position at IRISA, Rennes, France



Dear list,

We are seeking to recruit a postdoctoral researcher on adaptive spectral modeling of audio signals, applied to source separation and object-based sound scene description (full subject below). The successful candidate will work in the METISS group at the public research institute IRISA under the supervision of Drs. Emmanuel Vincent and Rémi Gribonval.

Prospective candidates should have a background in signal processing and either hold a PhD obtained in May 2007 or later, or be about to obtain one. Additional knowledge of audio is appreciated. Informal enquiries may be made to Emmanuel Vincent (emmanuel.vincent@xxxxxxxx).

This appointment will start in fall 2008, with a salary of 28,000 euros per annum. Applications must be submitted online before February 15th at
http://www.inria.fr/travailler/mrted/en/postdoc/details.html?nPostingTargetID=4835




Semi-adaptive spectrum models for audio signal processing

Most audio signals are mixtures of several sources present at the same time: speakers, musical instruments, natural sounds. The modeling of these sources is the core problem behind many audio signal processing tasks, such as source separation, content classification and speech/music transcription. The most general family of models represents the short-term power spectrum of each source as a linear combination of basis spectra learned on single-source training data. Adapting these models to the mixture under consideration is bound to improve performance. However, existing adaptation techniques often fail because of the large number of free parameters per source.

Our team has recently started to investigate a new adaptation paradigm for such models, whereby generic constraints satisfied by a range of audio sources are specified on the model parameters. This reduces the number of free parameters, hence improving the quality of adaptation and removing the need for single-source training data. For instance, harmonicity can be enforced by representing each basis spectrum as a linear combination of fixed spectra each representing a few adjacent harmonic partials. The spectral envelope of each basis spectrum is then learned by adapting the combination weights.
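To make the harmonicity example concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the team's implementation: the Gaussian peak shape, the peak width, the grouping of two partials per fixed spectrum and all function names are hypothetical choices made for this example. Only the combination weights would be adapted to the mixture; the fixed narrowband spectra stay unchanged.

```python
import math

def narrowband_spectrum(f0, partials, freqs, width):
    """Fixed spectrum covering a few adjacent harmonic partials of f0.
    Gaussian-shaped peaks are an assumption made for this sketch."""
    return [sum(math.exp(-0.5 * ((f - h * f0) / width) ** 2) for h in partials)
            for f in freqs]

def harmonic_basis_spectrum(f0, weights, freqs, partials_per_band=2, width=10.0):
    """Basis spectrum as a weighted sum of fixed narrowband spectra.
    The weights describe the spectral envelope and are the only free parameters."""
    basis = [0.0] * len(freqs)
    for k, w in enumerate(weights):
        first = k * partials_per_band + 1            # first partial of band k
        partials = range(first, first + partials_per_band)
        band = narrowband_spectrum(f0, partials, freqs, width)
        basis = [b + w * e for b, e in zip(basis, band)]
    return basis

def source_power_spectrum(basis_spectra, activations):
    """Short-term power spectrum of one source in one frame:
    a linear combination of its basis spectra."""
    n_bins = len(basis_spectra[0])
    return [sum(a * b[i] for a, b in zip(activations, basis_spectra))
            for i in range(n_bins)]

freqs = [float(f) for f in range(0, 2001, 5)]   # frequency grid in Hz (hypothetical)
weights = [1.0, 0.5, 0.25]                      # envelope: three free parameters
basis = harmonic_basis_spectrum(220.0, weights, freqs)
frame = source_power_spectrum([basis], [2.0])   # one basis spectrum, one activation
```

In this toy setting the basis spectrum over 401 frequency bins is described by three weights instead of 401 free values, which is the reduction in free parameters referred to above.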

The first goal of this postdoc is to validate this paradigm in a fully general context by designing and testing a larger set of appropriate constraints. Possible constraints include: inharmonicity or wideband character of the basis spectra, source-filter model of the spectral envelope, transient or continuous character of the combination weight sequences. A second goal is to propose a way of exploiting the available spatial information for model adaptation. A promising approach consists of modeling and tracking the interchannel intensity difference of each source, in addition to its power spectrum.
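For readers less familiar with the spatial cue mentioned here, the interchannel intensity difference of a stereo signal is commonly expressed as the ratio of left and right channel powers in decibels. The sketch below only illustrates that quantity for one frame; the function name, the small regularization constant and the toy values are assumptions made for this example, and it does not implement the modeling or tracking proposed in the subject.

```python
import math

def interchannel_intensity_difference(power_left, power_right, eps=1e-12):
    """Per-bin intensity difference in dB between two channels.
    eps (an assumption of this sketch) avoids division by zero and log of zero."""
    return [10.0 * math.log10((l + eps) / (r + eps))
            for l, r in zip(power_left, power_right)]

# Toy power spectra for one frame (hypothetical values)
left = [4.0, 1.0, 10.0]
right = [4.0, 2.0, 1.0]
iid = interchannel_intensity_difference(left, right)
```

A value near 0 dB indicates a bin dominated by a centered source, while positive or negative values indicate a source panned towards the left or right channel.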

The proposed paradigm is expected to have a high impact on the processing of audio data within large databases, where single-source training data are typically unavailable due to the huge range of possible sources. The results will be primarily evaluated for the task of source separation on a wide range of audio mixtures, including speech, music and natural sound scenes. Depending on the research background of the applicant, additional tasks will be considered, such as temporal decomposition of speech, multiple pitch estimation of music or sound object identification within natural sound scenes.


--
Emmanuel Vincent
METISS Project-Team
IRISA-INRIA
Campus de Beaulieu, 35042 Rennes cedex, France
Phone: +332 9984 2269 - Fax: +332 9984 7171
Web: http://www.irisa.fr/metiss/members/evincent/