[AUDITORY] Call for participation to the ICME2025 Audio Encoder Challenge (Wenwu Wang)


Subject: [AUDITORY] Call for participation to the ICME2025 Audio Encoder Challenge
From:    Wenwu Wang  <000000615c5e5fae-dmarc-request@xxxxxxxx>
Date:    Thu, 27 Feb 2025 22:17:07 +0000

ICME2025-Audio-Encoder-Challenge
https://dataoceanai.github.io/ICME2025-Audio-Encoder-Challenge/

The IEEE International Conference on Multimedia & Expo (ICME) 2025 Audio Encoder Capability Challenge

Overview

The ICME 2025 Audio Encoder Capability Challenge, hosted by Xiaomi, University of Surrey, and Dataocean AI, aims to rigorously evaluate audio encoders in real-world downstream tasks.

This challenge imposes NO restrictions on model size or the scale of training data, and training based on existing pre-trained models is allowed.

Participants are invited to submit pre-trained encoders that convert raw audio waveforms into continuous embeddings. These encoders will undergo comprehensive testing across diverse tasks spanning speech, environmental sounds, and music. The evaluation will emphasize real-world usability and leverage an open-source evaluation system <https://github.com/jimbozhang/xares>.

Participants are welcome to independently test and optimize their models. However, the final rankings will be determined based on evaluations conducted by the organizers.

Registration

To participate, registration is required. Please complete the registration form <https://forms.gle/VGgRQdPLs9f72UM8A> before April 1, 2025. Note that this does not mean the challenge starts on April 1, 2025. The challenge begins on February 7, 2025.

For any other information about registration, please email: 2025icme-aecc@xxxxxxxx

Submission

1. Clone the audio encoder template from the GitHub repository <https://github.com/jimbozhang/xares-template.git>.
2. Implement your own audio encoder following the instructions in README.md within the cloned repository. The implementation must pass all checks in audio_encoder_checker.py provided in the repository.
3. Before the submission deadline, April 30, 2025, email the following files to the organizers at 2025icme-aecc@xxxxxxxx:
   * a ZIP file containing the complete repository
   * a technical report (PDF format) not exceeding 6 pages describing your implementation

The pre-trained model weights can either be included in the ZIP file or downloaded automatically from external sources (e.g., Hugging Face) during runtime. If choosing the latter approach, please implement the automatic downloading mechanism in your encoder implementation.

While there are no strict limitations on model size, submitted models must run successfully in a Google Colab T4 environment, where the runtime is equipped with a 16 GB NVIDIA Tesla T4 GPU and 12 GB of RAM.

More details can be found on the following webpage:
https://dataoceanai.github.io/ICME2025-Audio-Encoder-Challenge/

Thanks for your attention. Sorry for cross-posting.

Best wishes,

Wenwu

--
Wenwu Wang

Professor of Signal Processing and Machine Learning,
Centre for Vision Speech and Signal Processing (CVSSP)

Associate Head of External Engagement,
School of Computer Science and Electronic Engineering

AI Fellow,
Surrey Institute for People Centred AI

University of Surrey
Guildford, GU2 7XH
United Kingdom
Phone: +44 (0) 1483 686039
Fax: +44 (0) 1483 686031
Email: w.wang@xxxxxxxx
https://personalpages.surrey.ac.uk/w.wang/
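
The submission steps in the message above ask for a ZIP file containing the complete repository. A minimal sketch of that packaging step follows, using only the Python standard library. The function name zip_repository, the output file name, and the demo directory layout are assumptions made for illustration; they are not part of the challenge rules or of the xares-template repository.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_repository(repo_dir: Path, zip_path: Path) -> list[str]:
    """Pack every file under repo_dir into zip_path.

    Returns the repository-relative names of the files that were added.
    Hypothetical helper: the organizers only ask for "a ZIP file containing
    the complete repository" and do not prescribe how to build it.
    """
    added = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(repo_dir.rglob("*")):
            # Skip directories, and skip the archive itself if it lives in the repo.
            if not path.is_file() or path.resolve() == zip_path.resolve():
                continue
            relative = path.relative_to(repo_dir)
            # Keep the repository folder as the top-level entry inside the ZIP.
            archive.write(path, arcname=(Path(repo_dir.name) / relative).as_posix())
            added.append(relative.as_posix())
    return added


def demo() -> None:
    # Build a throwaway stand-in for a cloned template; real contents will differ.
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp) / "xares-template"
        repo.mkdir()
        (repo / "README.md").write_text("instructions\n")
        (repo / "audio_encoder_checker.py").write_text("# checker placeholder\n")
        archive_path = Path(tmp) / "submission.zip"  # assumed file name
        for name in zip_repository(repo, archive_path):
            print("added", name)


if __name__ == "__main__":
    demo()
```

Writing the archive outside the repository directory, as the demo does, keeps it from being packed into itself. If the weights are bundled rather than downloaded at runtime, they are picked up automatically because every file under the repository is included.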


This message came from the mail archive
postings/2025/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University