Subject: Re: [AUDITORY] Reconstructing faces from voices
From: bhiksha raj <bhiksha@xxxxxxxx>
Date: Wed, 12 Jun 2019 11:17:49 -0400
List-Archive: <http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Just a follow-up: Yandong has his code up here:
https://github.com/cmu-mlsp/reconstructing_faces_from_voices

On Tue, May 28, 2019 at 7:27 AM bhiksha raj <bhiksha@xxxxxxxx> wrote:
>
> Dear All,
>
> Thought you might be interested in this work by Yandong Wen and Rita
> Singh at CMU. (I'm a co-author by virtue of being in an advisory role.)
>
> https://arxiv.org/abs/1905.10604
>
> Abstract: Voice profiling aims at inferring various human parameters
> from their speech, e.g. gender, age, etc. In this paper, we address
> the challenge posed by a subtask of voice profiling - reconstructing
> someone's face from their voice. The task is designed to answer the
> question: given an audio clip spoken by an unseen person, can we
> picture a face that has as many common elements, or associations, as
> possible with the speaker, in terms of identity? To address this
> problem, we propose a simple but effective computational framework
> based on generative adversarial networks (GANs). The network learns to
> generate faces from voices by matching the identities of generated
> faces to those of the speakers, on a training set. We evaluate the
> performance of the network by leveraging a closely related task -
> cross-modal matching. The results show that our model is able to
> generate faces that match several biometric characteristics of the
> speaker, and results in matching accuracies that are much better than
> chance.
>
> best
> Bhiksha
>
> --
> Bhiksha Raj
> Carnegie Mellon University
> Pittsburgh, PA, USA
> Tel: 412 268 9826

--
Bhiksha Raj
Carnegie Mellon University
Pittsburgh, PA, USA
Tel: 412 268 9826
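For readers curious what "matching the identities of generated faces to those of the speakers" looks like in code, below is a minimal toy sketch of that objective only. It is NOT the authors' implementation (see the GitHub repo above for that): the linear "generator", the frozen random "face classifier", and all names and dimensions here are illustrative assumptions. The idea it demonstrates is training a generator so that a fixed identity classifier, applied to the generated face, predicts the speaker's identity.

```python
# Toy sketch of an identity-matching loss (illustrative only; the real
# system uses neural networks and an adversarial term, per the abstract).
import numpy as np

rng = np.random.default_rng(0)
n_ids, d_voice, d_face = 4, 8, 6

# Hypothetical frozen face-identity classifier; in the paper this role is
# played by a pretrained face recognition network.
W_cls = rng.standard_normal((d_face, n_ids))

# Toy generator: one linear map from voice embedding to face embedding.
W_gen = 0.1 * rng.standard_normal((d_voice, d_face))

voice = rng.standard_normal(d_voice)  # a made-up voice embedding
speaker_id = 2                        # its identity label

def softmax(z):
    e = np.exp(z - z.max())           # shift for numerical stability
    return e / e.sum()

def identity_loss(W):
    """Cross-entropy between the classifier's posterior on the generated
    face and the true speaker identity."""
    probs = softmax((voice @ W) @ W_cls)
    return -np.log(probs[speaker_id])

initial = identity_loss(W_gen)
lr = 0.01
for _ in range(500):
    probs = softmax((voice @ W_gen) @ W_cls)
    d_logits = probs.copy()
    d_logits[speaker_id] -= 1.0       # grad of cross-entropy wrt logits
    d_face = W_cls @ d_logits         # backprop through the frozen classifier
    W_gen -= lr * np.outer(voice, d_face)  # gradient step on the generator
final = identity_loss(W_gen)
print(f"identity loss: {initial:.3f} -> {final:.3f}")
```

After a few hundred gradient steps the classifier assigns the generated face to the correct speaker with high probability, which is the sense in which the generated face "matches" the speaker's identity.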