
[AUDITORY] Announcing OpenL3 v0.3.0: now supporting audio AND image embeddings (and more!)



(Apologies for cross-posting)

Hello everyone!

We're excited to announce the release of version 0.3.0 of OpenL3, an open-source deep audio embedding based on the self-supervised L3-Net. As a reminder, OpenL3 is an improved version of L3-Net, and outperforms VGGish and SoundNet (and the original L3-Net) on several sound recognition tasks.

In this latest version, we have added functionality for extracting image embeddings, processing video files, and batch processing. OpenL3 is open source and readily available for everyone to use: if you have TensorFlow installed, just run pip install openl3 and you're good to go!
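
For anyone who wants to try it right after installing, here is a minimal sketch of extracting an audio embedding. It is an illustration rather than official usage: the helper names write_test_tone and embed_tone are made up for this example, the synthetic 440 Hz tone only stands in for a real recording, and the parameter values (content_type, input_repr, embedding_size) are just one plausible choice. It also assumes the soundfile package is available for reading the WAV file. Please check the OpenL3 documentation for the options supported by your installed version.

```python
import math
import os
import struct
import tempfile
import wave


def write_test_tone(path, sr=48000, seconds=2.0, freq=440.0):
    """Write a mono 16-bit sine tone (hypothetical helper, stands in for real audio)."""
    n = int(sr * seconds)
    frames = b"".join(
        struct.pack("<h", int(0.3 * 32767 * math.sin(2 * math.pi * freq * i / sr)))
        for i in range(n)
    )
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sr)
        w.writeframes(frames)
    return n


def embed_tone(path):
    """Return (embedding, timestamps), or None if openl3/soundfile are not installed."""
    try:
        import openl3
        import soundfile as sf
    except ImportError:
        return None
    audio, sr = sf.read(path)
    # Parameter values below are illustrative assumptions, not recommendations.
    return openl3.get_audio_embedding(
        audio, sr, content_type="music", input_repr="mel256", embedding_size=512
    )


wav_path = os.path.join(tempfile.mkdtemp(), "tone.wav")
n_samples = write_test_tone(wav_path)

result = embed_tone(wav_path)
if result is None:
    print("openl3 or soundfile not installed; skipping embedding step")
else:
    emb, ts = result
    print("embedding array shape:", emb.shape)
    print("number of timestamps:", len(ts))
```

The image side follows the same pattern through openl3.get_image_embedding, which takes an image array in place of the audio and sample rate.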

Full details are provided in our paper:

Look, Listen and Learn More: Design Choices for Deep Audio Embeddings
J. Cramer, H.-H. Wu, J. Salamon, and J. P. Bello.
IEEE Int. Conf. on Acoustics, Speech and Signal Proc. (ICASSP), pp. 3852-3856, Brighton, UK, May 2019.

Cheers,
Jason Cramer, Ho-Hsiang Wu, Justin Salamon and Juan Pablo Bello.