Subject: Call for participation - The Interspeech 2026 Audio Encoder Capability Challenge for Large Audio Language Models
From: Wenwu Wang <000000615c5e5fae-dmarc-request@xxxxxxxx>
Date: Mon, 5 Jan 2026 00:34:22 +0000

Call for participation

The Interspeech 2026 Audio Encoder Capability Challenge for Large Audio Language Models

The Interspeech 2026 Audio Encoder Capability Challenge, hosted by Xiaomi, University of Surrey, Tsinghua University and Dataocean AI, evaluates pre-trained audio encoders as front-end modules for large audio language models (LALMs), focusing on their ability to understand and represent audio semantics in complex scenarios.

The challenge adopts a unified end-to-end generative evaluation framework. Participants only need to submit a pre-trained encoder model; the downstream task training and evaluation are carried out by the organizers. The organizers provide the XARES-LLM benchmark (https://github.com/xiaomi-research/xares-llm.git), an open-source evaluation system. XARES-LLM trains a typical LALM using the audio encoder provided by the user. The system automatically downloads the training data, trains the LALM, and then tests it on various downstream tasks, providing a score for each.

More details can be found at: https://dataoceanai.github.io/Interspeech2026-Audio-Encoder-Challenge/

Timeline:
* December 15, 2025: Challenge announcement
* February 12, 2026, 23:59 AoE: Submission deadline
* February 20, 2026: Final ranking announced
* February 25, 2026, 23:59 AoE: Paper submission deadline

Apologies for cross-posting.

Best wishes,

Wenwu

--
Wenwu Wang

Professor of Signal Processing and Machine Learning,
Centre for Vision Speech and Signal Processing (CVSSP)

Associate Head of External Engagement,
School of Computer Science and Electronic Engineering

AI Fellow,
Surrey Institute for People Centred AI

University of Surrey
Guildford, GU2 7XH
United Kingdom
Phone: +44 (0) 1483 686039
Fax: +44 (0) 1483 686031
Email: w.wang@xxxxxxxx
https://personalpages.surrey.ac.uk/w.wang/