[AUDITORY] New tools & data for soundscape synthesis and online audio annotation

From: Justin Salamon <justin.salamon@xxxxxxx>

Date: Tue, 10 Oct 2017 11:56:46 -0400

Comments: cc: Mark Cartwright <mark.cartwright@xxxxxxx>

Reply-to: Justin Salamon <justin.salamon@xxxxxxx>

Sender: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx>

*** apologies for cross-posting ***

Dear list,

We're glad to announce the release of two open-source tools and a new dataset, developed as part of the SONYC project, which we hope will be of use to the community:

Scaper: a library for soundscape synthesis and augmentation

- Automatically synthesize soundscapes with corresponding ground truth annotations

- Useful for running controlled ML experiments (ASR, sound event detection, bioacoustic species recognition, etc.)

- Useful for running controlled experiments to assess human annotation performance

- Potentially useful for generating data for source separation experiments (might require some extra code)

- Potentially useful for generating ambisonic soundscapes (definitely requires some extra code)
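
To make the first bullet concrete, here is a minimal, self-contained Python sketch of the general idea: place sound events at random times in a background and record the ground-truth annotation for each one. This is not Scaper's actual API. The names `synthesize`, `rms` and `SR`, the sine tones standing in for real foreground recordings, and the Gaussian-noise background are all assumptions made for illustration only.

```python
import math
import random

SR = 8000  # sample rate in Hz (assumed for this toy example)


def rms(x):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def synthesize(duration, events, snr_db, seed=0):
    """Mix toy events into background noise; return (audio, annotations)."""
    rng = random.Random(seed)
    n = int(duration * SR)
    # Background: low-level Gaussian noise standing in for a real recording.
    audio = [rng.gauss(0.0, 0.01) for _ in range(n)]
    bg_rms = rms(audio)
    annotations = []
    for label, freq, ev_dur in events:
        m = int(ev_dur * SR)
        # Random onset chosen so that the event fits inside the soundscape.
        onset = rng.uniform(0.0, duration - ev_dur)
        start = min(int(onset * SR), n - m)
        # Sine tone standing in for a real foreground sound clip.
        tone = [math.sin(2 * math.pi * freq * i / SR) for i in range(m)]
        # Scale the event so its RMS sits snr_db above the background RMS.
        gain = bg_rms * 10 ** (snr_db / 20) / rms(tone)
        for i in range(m):
            audio[start + i] += gain * tone[i]
        # The annotation is known exactly because we placed the event.
        annotations.append(
            {"label": label, "onset": onset, "offset": onset + ev_dur}
        )
    return audio, sorted(annotations, key=lambda a: a["onset"])


audio, annotations = synthesize(
    duration=2.0,
    events=[("siren", 880.0, 0.5), ("car_horn", 440.0, 0.3)],
    snr_db=10.0,
)
for a in annotations:
    print(a["label"], round(a["onset"], 3), round(a["offset"], 3))
```

Because every onset, offset and label is chosen by the program, the annotations are exact by construction, which is what makes synthesized soundscapes convenient for controlled experiments.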

AudioAnnotator: a JavaScript web interface for annotating audio data

- Developed in collaboration with Edith Law and her students at the University of Waterloo's HCI Lab

- A web interface that allows users to annotate audio recordings

- Supports 3 types of visualization (waveform, spectrogram, invisible)

- Useful for crowdsourcing audio labels

- Useful for running controlled experiments on crowdsourcing audio labels

- Supports feedback mechanisms for providing real-time feedback to the user based on their annotations

URBAN-SED dataset: a new dataset for sound event detection

- Includes 10,000 soundscapes with strongly labeled sound events generated using scaper

- Totals almost 30 hours and includes close to 50,000 annotated sound events

- Baseline convnet results on URBAN-SED are included in the Scaper paper.

Further information about scaper, the AudioAnnotator and the URBAN-SED dataset, including controlled experiments on the quality of crowdsourced human annotations as a function of visualization and soundscape complexity, is provided in the following papers:

Seeing sound: Investigating the effects of visualizations and complexity on crowdsourced audio annotations

M. Cartwright, A. Seals, J. Salamon, A. Williams, S. Mikloska, D. MacConnell, E. Law, J. Bello, and O. Nov.

Proceedings of the ACM on Human-Computer Interaction, 1(2), 2017.

Scaper: A Library for Soundscape Synthesis and Augmentation

J. Salamon, D. MacConnell, M. Cartwright, P. Li, and J. P. Bello.

In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, Oct. 2017.

We hope you find these tools and data useful and look forward to receiving your feedback (and pull requests!).

Cheers, on behalf of the entire team,

Justin Salamon & Mark Cartwright.

Justin Salamon, PhD

Senior Research Scientist

Music and Audio Research Laboratory (MARL)

& Center for Urban Science and Progress (CUSP)

New York University, New York, NY