Re: [AUDITORY] Reconstructing faces from voices

To: AUDITORY@xxxxxxxxxxxxxxx
Subject: Re: [AUDITORY] Reconstructing faces from voices
From: PIerre DIVENYI <pdivenyi@xxxxxxxxxxxxxxxxxx>
Date: Wed, 29 May 2019 08:05:54 -0700
Approved-by: pdivenyi@xxxxxxxxxxxxxxxxxx
Comments: To: bhiksha raj <bhiksha@xxxxxxxxx>
Reply-to: PIerre DIVENYI <pdivenyi@xxxxxxxxxxxxxxxxxx>
Sender: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx>

Holy cow, Raj! Will be following this...
Pierre

Sent from my autocorrecting iPad

> On May 28, 2019, at 04:27, bhiksha raj <bhiksha@xxxxxxxxx> wrote:
> 
> Dear All
> 
> Thought you might be interested in this work by Yandong Wen and Rita
> Singh at CMU.
> (I'm a co-author by virtue of being in an advisory role)
> 
> https://arxiv.org/abs/1905.10604
> 
> Abstract:  Voice profiling aims at inferring various human parameters
> from their speech, e.g. gender, age, etc. In this paper, we address
> the challenge posed by a subtask of voice profiling - reconstructing
> someone's face from their voice. The task is designed to answer the
> question: given an audio clip spoken by an unseen person, can we
> picture a face that has as many common elements, or associations as
> possible with the speaker, in terms of identity? To address this
> problem, we propose a simple but effective computational framework
> based on generative adversarial networks (GANs). The network learns to
> generate faces from voices by matching the identities of generated
> faces to those of the speakers, on a training set. We evaluate the
> performance of the network by leveraging a closely related task -
> cross-modal matching. The results show that our model is able to
> generate faces that match several biometric characteristics of the
> speaker, and results in matching accuracies that are much better than
> chance.
> 
> best
> Bhiksha
> 
> -- 
> Bhiksha Raj
> Carnegie Mellon University
> Pittsburgh, PA, USA
> Tel: 412 268 9826
>
