Re: [AUDITORY] Reconstructing faces from voices (Pierre DIVENYI)


Subject: Re: [AUDITORY] Reconstructing faces from voices
From:    Pierre DIVENYI  <pdivenyi@xxxxxxxx>
Date:    Wed, 29 May 2019 08:05:54 -0700
List-Archive: <http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Holy cow, Raj! Will be following this...
Pierre

Sent from my autocorrecting iPad

> On May 28, 2019, at 04:27, bhiksha raj <bhiksha@xxxxxxxx> wrote:
>
> Dear All
>
> Thought you might be interested in this work by Yandong Wen and Rita
> Singh at CMU.
> (I'm a co-author by virtue of being in an advisory role)
>
> https://arxiv.org/abs/1905.10604
>
> Abstract: Voice profiling aims at inferring various human parameters
> from their speech, e.g. gender, age, etc. In this paper, we address
> the challenge posed by a subtask of voice profiling - reconstructing
> someone's face from their voice. The task is designed to answer the
> question: given an audio clip spoken by an unseen person, can we
> picture a face that has as many common elements, or associations as
> possible with the speaker, in terms of identity? To address this
> problem, we propose a simple but effective computational framework
> based on generative adversarial networks (GANs). The network learns to
> generate faces from voices by matching the identities of generated
> faces to those of the speakers, on a training set. We evaluate the
> performance of the network by leveraging a closely related task -
> cross-modal matching. The results show that our model is able to
> generate faces that match several biometric characteristics of the
> speaker, and results in matching accuracies that are much better than
> chance.
>
> best
> Bhiksha
>
> --
> Bhiksha Raj
> Carnegie Mellon University
> Pittsburgh, PA, USA
> Tel: 412 268 9826
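
[Editor's note: for readers curious how the identity-matched GAN described in the abstract might be wired up, here is a minimal, hypothetical PyTorch sketch. It is not the authors' implementation: the layer sizes, loss weights, and names (Generator, Discriminator, IdentityClassifier, train_step) are illustrative assumptions. It only shows the core idea of a generator mapping a voice embedding to a face, an adversarial real/fake critic, and an identity classifier that pushes generated faces toward the speaker's identity.]

# Hypothetical sketch of a voice-to-face GAN training step (PyTorch).
# Shapes, networks, and loss weights are illustrative, not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

EMB_DIM, IMG_DIM, N_SPEAKERS = 128, 64 * 64 * 3, 100

class Generator(nn.Module):
    """Maps a fixed-length voice embedding to a (flattened) face image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(EMB_DIM, 512), nn.ReLU(),
            nn.Linear(512, IMG_DIM), nn.Tanh())
    def forward(self, v):
        return self.net(v)

class Discriminator(nn.Module):
    """Real-vs-generated face critic."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(IMG_DIM, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 1))
    def forward(self, x):
        return self.net(x)

class IdentityClassifier(nn.Module):
    """Predicts speaker identity from a face; used to match identities."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(IMG_DIM, 512), nn.ReLU(),
            nn.Linear(512, N_SPEAKERS))
    def forward(self, x):
        return self.net(x)

G, D, C = Generator(), Discriminator(), IdentityClassifier()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(list(D.parameters()) + list(C.parameters()), lr=2e-4)

def train_step(voice_emb, real_face, speaker_id):
    """One adversarial + identity-matching update on a mini-batch."""
    n = voice_emb.size(0)

    # Discriminator/classifier update: real faces are "real" and carry the
    # speaker's identity label; generated faces are "fake".
    fake_face = G(voice_emb)
    d_loss = (F.binary_cross_entropy_with_logits(D(real_face), torch.ones(n, 1))
              + F.binary_cross_entropy_with_logits(D(fake_face.detach()), torch.zeros(n, 1))
              + F.cross_entropy(C(real_face), speaker_id))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator update: fool the discriminator AND make the generated face be
    # classified as the correct speaker (the identity-matching objective).
    fake_face = G(voice_emb)
    g_loss = (F.binary_cross_entropy_with_logits(D(fake_face), torch.ones(n, 1))
              + F.cross_entropy(C(fake_face), speaker_id))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

if __name__ == "__main__":
    # Random stand-ins for a real batch of voice embeddings, faces, and labels.
    v = torch.randn(8, EMB_DIM)
    f = torch.rand(8, IMG_DIM) * 2 - 1
    y = torch.randint(0, N_SPEAKERS, (8,))
    print(train_step(v, f, y))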


This message came from the mail archive
src/postings/2019/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University