
[AUDITORY] AVA Speech dataset now available



Hi Everyone,
I'm happy to announce the release of a new dataset, AVA Speech, which provides speech activity labels for v1.0 of the AVA dataset:

– It contains dense labels indicating when speech is present, along with the background condition: clean speech, speech with background music, or speech with background noise. Multiple raters annotated every instant of each 15-minute clip, and their ratings were merged by majority vote to produce the final set of released labels.

– The dataset is available on the AVA Download page.

– This work is described in more detail in our paper (available on arXiv here), which will be presented at Interspeech 2018 on September 4. In addition to the data itself, the paper provides baseline numbers for speech detection performance in the various conditions, using audio-only and visual-only systems.

– Please use the ava-dataset-users Google group for discussions and questions around the dataset, and please feel free to forward this note to relevant lists.
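
For readers curious about the merging step mentioned above, here is a rough sketch of how per-frame ratings from several raters could be combined by majority vote. It is only an illustration and not the pipeline used to build the dataset: the label strings, the function name majority_vote, the frame-level representation and the tie-breaking rule (the earliest rater's label wins) are all assumptions made for this example.

```python
from collections import Counter

def majority_vote(ratings):
    """Merge per-frame labels from several raters by majority vote.

    ratings: list of label sequences, one per rater, all the same length.
    Ties go to the label that appears first in rater order; this is an
    assumption of the sketch, not a documented rule of the dataset.
    """
    if not ratings:
        raise ValueError("need at least one rater")
    n_frames = len(ratings[0])
    if any(len(r) != n_frames for r in ratings):
        raise ValueError("all raters must label the same number of frames")

    merged = []
    for frame_labels in zip(*ratings):
        counts = Counter(frame_labels)
        best = max(counts.values())
        # Pick the first label (in rater order) that reaches the top count.
        merged.append(next(l for l in frame_labels if counts[l] == best))
    return merged

# Hypothetical label names, one label per frame per rater.
raters = [
    ["NO_SPEECH", "CLEAN_SPEECH", "SPEECH_WITH_MUSIC"],
    ["NO_SPEECH", "CLEAN_SPEECH", "SPEECH_WITH_NOISE"],
    ["CLEAN_SPEECH", "CLEAN_SPEECH", "SPEECH_WITH_NOISE"],
]
print(majority_vote(raters))
```

With an odd number of raters and only two candidate labels per frame, ties cannot occur; with more labels they can, which is why the sketch needs an explicit tie-breaking rule.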

Regards,
 Sourish Chaudhuri
Google AI Perception