Post-doctoral position on audiovisual speech separation (deadline Sept 1)
TITLE: Environment-robust audiovisual speech separation
RECRUITMENT DATE: as soon as possible between October 1 and December 1,
2011
DURATION: 18 months
SALARY: depending on experience
PRINCIPAL INVESTIGATOR: Nancy Bertin (nancy.bertin@xxxxxxxx)
CO-PRINCIPAL INVESTIGATOR: Emmanuel Vincent (emmanuel.vincent@xxxxxxxx)
DESCRIPTION OF THE PROJECT:
Speech separation is the task of estimating the signal of each speaker
within a recorded sound scene involving one or more speakers and
background noise. Existing approaches have typically been assessed in
specific environments, e.g., meeting environments involving concurrent
speakers and moderate reverberation, or outdoor environments involving
diffuse background noise but no reverberation [1].
The purpose of this postdoctoral position is to propose a source
separation algorithm applicable to a wide range of environments,
together with a set of associated use case scenarios. In the first
stage, an experimental multi-environment benchmark will be developed and
a number of state-of-the-art algorithms will be evaluated on it. In the
second stage, new environment-robust algorithms will be investigated by
designing improved speaker and background noise models and integrating
them, together with the available video information, into the
state-of-the-art variance modeling-based source separation framework
[2,3,4]. A range of separation-related tasks, such as enhancement and
denoising, will be proposed and evaluated in order to identify the use
case scenarios that make the best use of the technology at hand.
[1] E. Vincent, S. Araki, F.J. Theis, G. Nolte, P. Bofill, H. Sawada, A.
Ozerov, B.V. Gowreesunker, D. Lutter, and N.Q.K. Duong, "The Signal
Separation Evaluation Campaign (2007-2010): Achievements and remaining
challenges", Technical Report RR-7581, INRIA, 2011.
[2] E. Vincent, M.G. Jafari, S.A. Abdallah, M.D. Plumbley, and M.E.
Davies, "Probabilistic modeling paradigms for audio source separation",
in Machine Audition: Principles, Algorithms and Systems, IGI Global, pp.
162-185, 2010.
[3] A. Ozerov, E. Vincent, and F. Bimbot, "A general flexible framework
for the handling of prior information in audio source separation",
Technical Report RR-7453, INRIA, 2010.
[4] A. Llagostera Casanovas, G. Monaci, P. Vandergheynst, and R.
Gribonval, "Blind audiovisual source separation based on sparse
redundant representations", IEEE Transactions on Multimedia, 12(5), pp.
358-371, 2010.
WORK ENVIRONMENT:
CNRS, the French National Center for Scientific Research, and INRIA, the
French National Institute for Research in Computer Science and Control,
both play a leading role in the development of Information Science and
Technology (IST) in Europe. The METISS team
(http://www.irisa.fr/metiss/) gathers a staff of 20 people focusing on
audio signal processing research within the joint CNRS/INRIA lab called
IRISA in Rennes.
This position is part of a collaborative project with Canon Research
Centre France (CRF) in nearby Cesson-Sévigné. It will involve regular
exchanges and collaboration with the Audio Research Team at CRF.
CANDIDATE PROFILE:
Prospective candidates must hold or be about to defend a PhD in audio
signal processing. Proficiency in Matlab programming is required.
Additional expertise in audio benchmarking, source separation, or
audiovisual processing would be an asset.
APPLICATION:
Applications including a full resume, a letter of motivation and up to
three reference letters must be sent by email to the principal
investigator before September 1, 2011. Phone interviews with selected
candidates will be held in early September.