[AUDITORY] Announcing OpenL3, a competitive and open deep audio embedding! (Jason Cramer )


Subject: [AUDITORY] Announcing OpenL3, a competitive and open deep audio embedding!
From:    Jason Cramer  <jtc440@xxxxxxxx>
Date:    Sat, 11 May 2019 08:49:06 -0400
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

(Apologies for cross-posting)

Hello everyone!

We're excited to announce the release of OpenL3 <https://github.com/marl/openl3>, an open-source deep audio embedding based on the self-supervised L3-Net <https://deepmind.com/research/publications/look-listen-and-learn/>. OpenL3 is an improved version of L3-Net, and outperforms VGGish and SoundNet (and the original L3-Net) on several sound recognition tasks. Most importantly, OpenL3 is open source and readily available for everyone to use: if you have TensorFlow installed, just run pip install openl3 and you're good to go!

Full details are provided in our paper, which will be presented at ICASSP 2019:

Look, Listen and Learn More: Design Choices for Deep Audio Embeddings
<http://www.justinsalamon.com/uploads/4/3/9/4/4394963/cramer_looklistenlearnmore_icassp_2019.pdf>
J. Cramer, H.-H. Wu, J. Salamon, and J. P. Bello.
IEEE Int. Conf. on Acoustics, Speech and Signal Proc. (ICASSP), pp. 3852-3856, Brighton, UK, May 2019.

If you're attending ICASSP 2019 and would like to discuss OpenL3 with us, please stop by our poster on Friday, May 17, between 13:30 and 15:30 (session MLSP-P17: Deep Learning V, Poster Area G, paper 2149).

We look forward to seeing what the community does with OpenL3!

Cheers,
Jason Cramer, Ho-Hsiang Wu, Justin Salamon and Juan Pablo Bello.
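[Editor's note: for readers who want to try the package, a minimal usage sketch follows. The function name and behavior below are taken from the OpenL3 project README and may differ between package versions; the file name "example.wav" and the soundfile dependency are illustrative assumptions. Running it requires TensorFlow and the downloaded model weights.]

```python
# Minimal OpenL3 usage sketch (assumes openl3, soundfile, and
# TensorFlow are installed; the API name follows the project
# README and may differ between versions).
import openl3
import soundfile as sf

# Load an audio clip; "example.wav" is a placeholder file name.
audio, sr = sf.read("example.wav")

# Compute one embedding vector per 1-second analysis window.
# `emb` is a 2-D array of shape (n_frames, embedding_size);
# `ts` gives the start time of each window in seconds.
emb, ts = openl3.get_audio_embedding(audio, sr)
print(emb.shape)
```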


This message came from the mail archive
src/postings/2019/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University