[AUDITORY] Call for papers for special sessions in speech and audio neural coding

Subject: [AUDITORY] Call for papers for special sessions in speech and audio neural coding

From: Dennis Xiao <xiao.dennis@xxxxxxxxx>

Date: Wed, 20 Dec 2023 08:38:53 +0800

Reply-to: Dennis Xiao <xiao.dennis@xxxxxxxxx>

Sender: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx>

### Apologies for cross-posting, please distribute ###

Dear all,

We plan to propose a special session at Interspeech 2024 on neural speech and audio coding. Please feel free to contact us if you have any questions.

Best regards,

Wei

### Detailed description of the proposed special session ###

Multi-functional neural speech and audio coding

Speech and audio coding is one of the critical technologies in real-time communication. Traditional signal processing-based (SP-based) coding methods rely mostly on physical models of sound perception and production, together with basic digital signal processing principles. Recently, speech synthesis and audio compression methods based on deep learning and artificial intelligence (AI) have been developed. Compared with SP-based methods, AI-based approaches open up more possibilities for audio compression and can achieve better performance with higher compression efficiency. However, AI-based methods (e.g., neural speech and audio coding) still suffer from problems including, but not limited to, insufficient robustness and high computational complexity, which have attracted attention from many academic and industrial organizations and researchers.

This session proposal aims to collect new ideas and developments in neural coding techniques, including low-bitrate and low-latency neural coding. We are also looking for new solutions that enable a neural codec to support additional functions such as packet loss concealment, noise reduction, voice conversion, text-to-speech (TTS), audio bandwidth extension, and topics related to AI-generated content (AIGC).

We therefore plan to apply for a special session at INTERSPEECH 2024. Please feel free to contact us if you are interested in contributing to this special session.

Organizers

Wei XIAO (denniswxiao@xxxxxxxxxxx),

Tencent Ethereal Audio Lab

Prof. Jing WANG (wangjing@xxxxxxxxxx),

Beijing Institute of Technology

Prof. Jingdong CHEN, IEEE Fellow (jingdongchen@xxxxxxxx),

Center of Intelligent Acoustics and Immersive Communications,

Northwestern Polytechnical University

Xuan ZHU (xuan.zhu@xxxxxxxxxxx),

Samsung Research China - Beijing