[AUDITORY] Synthesis Challenge at DCASE2023

Subject: [AUDITORY] Synthesis Challenge at DCASE2023

From: Laurie Heller <hellerl@xxxxxxxxxxxxxx>

Date: Fri, 3 Mar 2023 09:10:21 -0500

Reply-to: Laurie Heller <hellerl@xxxxxxxxxxxxxx>

Sender: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx>

Dear Auditory list:

Announcing a Foley synthesis challenge! 

It's a new part of the IEEE AASP Challenge on Detection and Classification
of Acoustic Scenes and Events (Task 7 of DCASE).

Call for entries is open, with a deadline of 15 May 2023. 

https://dcase.community/challenge2023/task-foley-sound-synthesis

The goal of this task is to build a Foley sound synthesis system that can
generate plausible audio signals fitting into given categories of sound.
Foley sound, in general, refers to sound effects that are created to
convey (and sometimes enhance) the sounds produced by events occurring in
a narrative (e.g. radio or film). Foley sounds are commonly added to
multimedia to enhance the perceptual audio experience. This sound
synthesis challenge requires the generation of original audio clips that
represent a category of sound, such as footsteps. The new sounds should
fit into the category that is typified by the set of sounds in the
development set, yet they should not duplicate any of the provided
sounds. Any synthesis approach is permitted (not just machine learning).

Why is this an important goal? First, obtaining a perfectly matched sound
effect currently requires time-consuming post-production. By generating
sound that belongs to a target sound category, Foley sound synthesis can
make the workflow much more time- and cost-efficient. With the rise of
virtual environments such as the metaverse, we expect a growing need for
the automated generation of increasingly complex and creative sound
environments. Second, it can be used for dataset synthesis or
augmentation for a wide variety of DCASE tasks, including sound event
detection (SED). SED has drawn great attention, and synthesized datasets
such as URBAN-SED are already in use. A high-quality Foley sound
synthesis model could lead to the development of better SED models.

There are 7 categories of sound events to be synthesized. The challenge
has two subproblems: the development of models with and without external
resources. Participants are expected to submit a system for one of the
two problems, and each problem is evaluated independently. Submissions
will be evaluated by Fréchet Audio Distance (FAD), followed by a
subjective test.
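
For readers unfamiliar with FAD: it is the Fréchet distance between two
multivariate Gaussians, one fitted to embeddings of reference audio and
one fitted to embeddings of generated audio. The sketch below is only an
illustration of that formula, not the challenge's evaluation code. It
assumes the embeddings have already been computed by some pretrained
audio model (this announcement does not say which one), and the function
name frechet_distance is made up for this example.

```python
import numpy as np


def frechet_distance(ref_emb, gen_emb):
    """Frechet distance between Gaussians fitted to two embedding sets.

    Each argument is an array of shape (n_clips, n_dims) with at least
    two rows. How the embeddings are produced is outside this sketch
    (assumption: they come from a pretrained audio model).
    """
    ref_emb = np.asarray(ref_emb, dtype=np.float64)
    gen_emb = np.asarray(gen_emb, dtype=np.float64)

    # Fit a Gaussian (mean and covariance) to each set of embeddings.
    mu_r, mu_g = ref_emb.mean(axis=0), gen_emb.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(ref_emb, rowvar=False))
    cov_g = np.atleast_2d(np.cov(gen_emb, rowvar=False))

    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g, which are real and non-negative for
    # positive semi-definite inputs. Clip tiny negative values that come
    # from floating-point error.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)
```

Lower values mean the generated set is statistically closer to the
reference set in the embedding space. Two identical sets give a distance
of approximately zero, and shifting every embedding by a constant vector
adds the squared length of that vector.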

#foleysynthesischallenge

Foley Challenge Organizers:

Keunwoo Choi, Gaudio Lab, Inc.; Korea
Jaekwon Im, Gaudio Lab, Inc., KAIST; Korea
Laurie M. Heller, Carnegie Mellon University; USA
Keisuke Imoto, Doshisha University; Japan
Mathieu Lagrange, CNRS, Ecole Centrale Nantes, Nantes University; France
Brian McFee, New York University; USA
Yuki Okamoto, Ritsumeikan University; Japan
Shinnosuke Takamichi, The University of Tokyo; Japan