[AUDITORY] Releasing the test set of FSDKaggle2019 dataset (used in DCASE 2019 Task2)

From: Eduardo Fonseca <eduardo.fonseca@xxxxxxx>

Date: Fri, 24 Jan 2020 18:40:16 +0100

Reply-to: Eduardo Fonseca <eduardo.fonseca@xxxxxxx>

Sender: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx>

=== Apologies for cross-posting ===

Dear list,

We’re glad to announce that we have released the full test set and labels of FSDKaggle2019. This dataset was used for DCASE 2019 Task 2, which was hosted on the Kaggle platform as a competition titled Freesound Audio Tagging 2019.

FSDKaggle2019 includes almost 30k audio clips amounting to over 100h of audio, encompassing 80 classes drawn from the AudioSet Ontology. It includes a human-curated train set from Freesound (~5k clips, ~11h), a noisy train set from Flickr (~20k clips, ~80h), and a test set from Freesound (~4.5k clips, ~13h). The dataset allows development and evaluation of machine listening methods in conditions of label noise, minimal supervision, and real-world acoustic mismatch.

FSDKaggle2019 is freely available from Zenodo: https://doi.org/10.5281/zenodo.3612637
You can find more details in our DCASE 2019 paper: E. Fonseca, M. Plakal, F. Font, D. P. W. Ellis, and X. Serra. Audio tagging with noisy labels and minimal supervision. Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, NYC, USA, 2019.

Both the competition and the dataset are a collaboration between the Music Technology Group of Universitat Pompeu Fabra and the Sound Understanding team at Google AI Perception. This effort was kindly sponsored by a Google Faculty Research Award 2018.

Best,

Eduardo, Manoj, Frederic, Dan and Xavier

Eduardo Fonseca
Music Technology Group
Universitat Pompeu Fabra