Post-doctoral position on audiovisual speech separation (deadline Sept 1)
TITLE: Environment-robust audiovisual speech separation
RECRUITMENT DATE: as soon as possible between October 1 and December 1, 
2011
DURATION: 18 months
SALARY: depending on experience
PRINCIPAL INVESTIGATOR: Nancy Bertin (nancy.bertin@xxxxxxxx)
CO-PRINCIPAL INVESTIGATOR: Emmanuel Vincent (emmanuel.vincent@xxxxxxxx)
DESCRIPTION OF THE PROJECT:
Speech separation is the task of estimating the signal of each speaker 
within a recorded sound scene involving one or more speakers and 
background noise. Existing approaches have typically been assessed in 
specific environments, e.g. meeting environments involving concurrent 
speakers and moderate reverberation, or outdoor environments involving 
diffuse background noise but no reverberation [1].
The purpose of this postdoctoral position is to propose a source 
separation algorithm applicable to a wide range of environments, 
together with a set of associated use case scenarios. In the first 
stage, an experimental multi-environment benchmark will be developed and 
a number of state-of-the-art algorithms will be evaluated. In the second 
stage, new environment-robust algorithms will be investigated by 
designing improved speaker and background noise models and integrating 
them, together with the available video information, into the 
state-of-the-art variance modeling-based source separation framework 
[2,3,4]. A range of separation-related tasks, such as enhancement and 
denoising, will be proposed and evaluated in order to identify the use 
case scenarios that make the best use of the technology at hand.
[1] E. Vincent, S. Araki, F.J. Theis, G. Nolte, P. Bofill, H. Sawada, A. 
Ozerov, B.V. Gowreesunker, D. Lutter, and N.Q.K. Duong, "The Signal 
Separation Evaluation Campaign (2007-2010): Achievements and remaining 
challenges", Technical Report RR-7581, INRIA, 2011.
[2] E. Vincent, M.G. Jafari, S.A. Abdallah, M.D. Plumbley, and M.E. 
Davies, "Probabilistic modeling paradigms for audio source separation", 
in Machine Audition: Principles, Algorithms and Systems, IGI Global, pp. 
162-185, 2010.
[3] A. Ozerov, E. Vincent, and F. Bimbot, "A general flexible framework 
for the handling of prior information in audio source separation", 
Technical Report RR-7453, INRIA, 2010.
[4] A. Llagostera Casanovas, G. Monaci, P. Vandergheynst, and R. 
Gribonval, "Blind audiovisual source separation based on sparse 
redundant representations", IEEE Transactions on Multimedia, 12(5), pp. 
358-371, 2010.
WORK ENVIRONMENT:
CNRS, the French National Center for Scientific Research, and INRIA, the 
French National Institute for Research in Computer Science and Control, 
both play a leading role in the development of Information Science and 
Technology (IST) in Europe. The METISS team 
(http://www.irisa.fr/metiss/) gathers a staff of 20 people focusing on 
audio signal processing research within the joint CNRS/INRIA lab called 
IRISA in Rennes.
This position is part of a collaborative project with Canon Research 
Centre France (CRF) in nearby Cesson-Sévigné. It will involve regular 
exchanges and collaboration with the Audio Research Team at CRF.
CANDIDATE PROFILE:
Prospective candidates must hold or be about to defend a PhD in audio 
signal processing. Proficiency in Matlab programming is required. 
Additional expertise in audio benchmarking, source separation or 
audiovisual processing would be an asset.
APPLICATION:
Applications including a full resume, a letter of motivation and up to 
three reference letters must be sent by email to the principal 
investigator before September 1, 2011. Phone interviews with selected 
candidates will be held in early September.