Dear Auditory list:
Announcing a Foley synthesis challenge!
It is a new task (Task 7) in the IEEE AASP Challenge on Detection and
Classification of Acoustic Scenes and Events (DCASE).
The call for entries is open, with a deadline of 15 May 2023.
This task aims to build a Foley sound synthesis system that can generate
plausible audio signals fitting into given categories of sound. Foley
sound, in general, refers to sound effects that are created to convey
(and sometimes enhance) the sounds produced by events occurring in a
narrative (e.g. radio or film). Foley sounds are commonly added to
multimedia to enhance the perceptual audio experience. This sound
synthesis challenge requires the generation of original audio clips that
represent a category of sound, such as footsteps. The new sounds should
fit into the category that is typified by the set of sounds in the
development set, yet they should not duplicate any of the provided
sounds. Any synthesis approach is permitted (not just machine learning).
Why is this an important goal? First, obtaining a perfectly matched sound
effect currently requires time-consuming post-production. By generating
sound that belongs to a target sound category, Foley sound synthesis can
make the workflow much more time- and cost-effective. With the rise of
virtual environments such as the metaverse, we expect a growing need for
the automated generation of more and more complex and creative sound
environments. Second, it can be utilized for dataset synthesis or
augmentation for a wide variety of DCASE tasks including sound event
detection (SED). SED has drawn great attention, and synthesized datasets
such as URBAN-SED are already in use. A high-quality Foley sound
synthesis model could lead to the development of better SED models.
There are seven categories of sound events to be synthesized. The
challenge has two subproblems: the development of models with and without
external resources. Participants are expected to submit a system for one
of the two subproblems, and each subproblem is evaluated independently.
Submissions will be evaluated by Fréchet Audio Distance (FAD), followed
by a subjective test.
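For readers unfamiliar with FAD: it compares Gaussian statistics of
embeddings of reference audio with those of generated audio, and lower
values mean the two sets are closer. The sketch below is only an
illustration, not the challenge's evaluation code. It assumes the
embeddings have already been computed by some pretrained audio model,
and it simplifies the covariance to a diagonal so that it runs with the
Python standard library alone; the full metric uses the complete
covariance matrices. The function names are hypothetical.

```python
import math
from statistics import fmean, pvariance


def gaussian_stats(embeddings):
    """Per-dimension mean and variance of a list of equal-length vectors."""
    dims = list(zip(*embeddings))  # transpose: one tuple per dimension
    means = [fmean(d) for d in dims]
    variances = [pvariance(d, mu) for d, mu in zip(dims, means)]
    return means, variances


def frechet_distance_diag(reference, generated):
    """Frechet distance between two Gaussians with DIAGONAL covariance.

    Assumption: the diagonal covariance is a simplification made for this
    sketch; FAD proper uses full covariance matrices.
    """
    mu_r, var_r = gaussian_stats(reference)
    mu_g, var_g = gaussian_stats(generated)
    # Squared distance between the two mean vectors.
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_r, mu_g))
    # Diagonal case of tr(S_r + S_g - 2 (S_r S_g)^(1/2)).
    cov_term = sum(vr + vg - 2.0 * math.sqrt(vr * vg)
                   for vr, vg in zip(var_r, var_g))
    return mean_term + cov_term


# Toy two-dimensional "embeddings" (made-up numbers, not real audio).
reference = [[0.0, 0.0], [2.0, 2.0]]
generated = [[1.0, 1.0], [3.0, 3.0]]
distance = frechet_distance_diag(reference, generated)
```

In this toy case the two sets have the same spread and differ only in
their means, so the distance comes entirely from the mean term.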
Foley Challenge Organizers:
Keunwoo Choi, Gaudio Lab, Inc.; Korea
Jaekwon Im, Gaudio Lab, Inc., KAIST; Korea
Laurie M. Heller, Carnegie Mellon University; USA
Keisuke Imoto, Doshisha University; Japan
Mathieu Lagrange, CNRS, Ecole Centrale Nantes, Nantes University; France
Brian McFee, New York University; USA
Yuki Okamoto, Ritsumeikan University; Japan
Shinnosuke Takamichi, The University of Tokyo; Japan