[AUDITORY] Reconstructing faces from voices



Dear All,

Thought you might be interested in this work by Yandong Wen and Rita
Singh at CMU.
(I'm a co-author by virtue of being in an advisory role.)

https://arxiv.org/abs/1905.10604

Abstract: Voice profiling aims at inferring various human parameters
from their speech, e.g. gender, age, etc. In this paper, we address
the challenge posed by a subtask of voice profiling - reconstructing
someone's face from their voice. The task is designed to answer the
question: given an audio clip spoken by an unseen person, can we
picture a face that has as many common elements, or associations as
possible with the speaker, in terms of identity? To address this
problem, we propose a simple but effective computational framework
based on generative adversarial networks (GANs). The network learns to
generate faces from voices by matching the identities of generated
faces to those of the speakers, on a training set. We evaluate the
performance of the network by leveraging a closely related task -
cross-modal matching. The results show that our model is able to
generate faces that match several biometric characteristics of the
speaker, and results in matching accuracies that are much better than
chance.

best
Bhiksha

-- 
Bhiksha Raj
Carnegie Mellon University
Pittsburgh, PA, USA
Tel: 412 268 9826