
[AUDITORY] Synthesis Challenge at DCASE2023

Dear Auditory list:

Announcing a Foley synthesis challenge! 

It's a new part of the IEEE AASP Challenge on Detection and Classification 
of Acoustic Scenes and Events (Task 7 of DCASE). 
Call for entries is open, with a deadline of 15 May 2023. 


This task aims to build a Foley sound synthesis system that can generate 
plausible audio signals fitting into given categories of sound. Foley 
sound, in general, refers to sound effects that are created to convey 
(and sometimes enhance) the sounds produced by events occurring in a 
narrative (e.g. radio or film). Foley sounds are commonly added to 
multimedia to enhance the perceptual audio experience. This sound 
synthesis challenge requires the generation of original audio clips that 
represent a category of sound, such as footsteps. The new sounds should 
fit into the category that is typified by the set of sounds in the 
development set, yet they should not duplicate any of the provided 
sounds. Any synthesis approach is permitted (not just machine learning).

Why is this an important goal? First, obtaining a perfectly matched 
sound effect currently requires time-consuming post-production. By 
generating sound that belongs to a target sound category, Foley sound 
synthesis can make the workflow much more time-efficient and 
cost-effective. With the rise of virtual environments such as the 
metaverse, we expect a growing need for the automated generation of 
increasingly complex and creative sound environments. Second, it can be 
used for dataset synthesis or augmentation for a wide variety of DCASE 
tasks, including sound event detection (SED). SED has drawn great 
attention, and synthesized datasets such as URBAN-SED are already in 
use. A high-quality Foley sound synthesis model could lead to the 
development of better SED models.

There are 7 categories of sound events to be synthesized. The challenge 
has two subproblems: the development of models with and without 
external resources. Participants are expected to submit a system for one 
of the two problems, and each problem is evaluated independently. 
Submissions will be evaluated by Fréchet Audio Distance (FAD), followed 
by a subjective test.
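
For readers unfamiliar with FAD, here is a minimal sketch of the 
Fréchet distance between two sets of audio embeddings, each modeled as 
a multivariate Gaussian. It is not the challenge's official evaluation 
code. The function name frechet_distance and the assumption that 
embeddings have already been extracted by some audio embedding model 
(one row per clip) are illustrative; this announcement does not specify 
the embedding model or the reference set.

```python
import numpy as np


def frechet_distance(emb_ref, emb_gen):
    """Frechet distance between Gaussians fitted to two embedding sets.

    emb_ref, emb_gen: arrays of shape (num_clips, embedding_dim).
    Both are assumed to be precomputed embeddings (hypothetical inputs).
    """
    mu_r = emb_ref.mean(axis=0)
    mu_g = emb_gen.mean(axis=0)
    cov_r = np.cov(emb_ref, rowvar=False)
    cov_g = np.cov(emb_gen, rowvar=False)

    # Tr(sqrt(cov_r @ cov_g)) equals the sum of the square roots of the
    # eigenvalues of the product; clip tiny negative values from round-off.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(
        diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * trace_sqrt
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.normal(size=(200, 4))       # stand-in for real embeddings
    generated = rng.normal(size=(200, 4)) + 0.5  # shifted distribution
    print(frechet_distance(reference, generated))
```

A lower value means the generated set's embedding statistics are closer 
to those of the reference set; identical sets give a distance of 
(numerically) zero.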

Foley Challenge Organizers:
Keunwoo Choi, Gaudio Lab, Inc.; Korea
Jaekwon Im, Gaudio Lab, Inc., KAIST; Korea
Laurie M. Heller, Carnegie Mellon University; USA
Keisuke Imoto, Doshisha University; Japan
Mathieu Lagrange, CNRS, Ecole Centrale Nantes, Nantes University; France
Brian McFee, New York University; USA
Yuki Okamoto, Ritsumeikan University; Japan
Shinnosuke Takamichi, The University of Tokyo; Japan