Subject: [AUDITORY] CFP: IEEE JSTSP Special Issue - deadline extended till 11 Nov 2024
From: "Hussain, Amir" <0000016168b2549a-dmarc-request@xxxxxxxx>
Date: Mon, 21 Oct 2024 19:06:47 +0000

Dear List (with apologies for any cross-postings),

Due to numerous requests from authors, the deadline for the Special Issue (SI) of the IEEE Journal of Selected Topics in Signal Processing (JSTSP) on "Deep Multimodal Speech Enhancement and Separation" has been extended until 11 November 2024 - see the CFP below and here:
https://signalprocessingsociety.org/publications-resources/special-issue-deadlines/ieee-jstsp-special-issue-deep-multimodal-speech-enhancement-and-separation

CFP: IEEE Journal of Selected Topics in Signal Processing (JSTSP)
Special Issue (SI) on: Deep Multimodal Speech Enhancement and Separation

Manuscripts Due: 11 November 2024 (final extension)
SI Publication Date: May 2025

Scope:

Voice is the modality humans most commonly use to communicate and blend into society. Recent technological advances have triggered the development of various voice-related applications in the information and communications technology market. However, noise, reverberation, and interfering speech are detrimental to effective communication between humans, and between humans and machines, degrading the performance of associated voice-enabled services. To address the formidable speech-in-noise challenge, a range of speech enhancement (SE) and speech separation (SS) techniques are commonly employed as front-end speech processing units that handle distortions in input signals and so provide more intelligible speech for automatic speech recognition (ASR), synthesis, and dialogue systems. Advances in artificial intelligence (AI) and machine learning, particularly deep neural networks, have led to remarkable improvements in SE- and SS-based solutions. A growing number of researchers have explored extensions of these methods that use a variety of modalities as auxiliary inputs to the main speech processing task, accessing additional information from heterogeneous signals. In particular, multimodal SE and SS systems have been shown to deliver enhanced performance in challenging noisy environments by augmenting the conventional speech modality with complementary information from multi-sensory inputs, such as video, noise type, signal-to-noise ratio (SNR), bone-conducted speech (vibrations), speaker identity, text, electromyography, and electromagnetic midsagittal articulometer (EMMA) data. Various integration schemes, including early and late fusion, cross-attention mechanisms, and self-supervised learning algorithms, have also been successfully explored.
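For readers less familiar with the integration schemes named above, the following is a minimal, hypothetical PyTorch sketch (not part of the official CFP) contrasting early fusion by feature concatenation with cross-attention fusion for mask-based audio-visual SE. All module names, dimensions, and layer choices are illustrative assumptions, not a reference implementation.

import torch
import torch.nn as nn

class EarlyFusionSE(nn.Module):
    """Early fusion: concatenate per-frame audio and visual features."""
    def __init__(self, audio_dim=257, visual_dim=128, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(audio_dim + visual_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, audio_dim),
            nn.Sigmoid(),  # predict a [0, 1] mask over frequency bins
        )

    def forward(self, audio, visual):
        # audio: (batch, frames, audio_dim) noisy magnitude spectrogram
        # visual: (batch, frames, visual_dim), e.g. lip-region embeddings
        return audio * self.net(torch.cat([audio, visual], dim=-1))

class CrossAttentionFusionSE(nn.Module):
    """Cross-attention fusion: audio frames attend to visual frames."""
    def __init__(self, audio_dim=257, visual_dim=128, hidden=256, heads=4):
        super().__init__()
        self.a_proj = nn.Linear(audio_dim, hidden)
        self.v_proj = nn.Linear(visual_dim, hidden)
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.mask = nn.Sequential(nn.Linear(hidden, audio_dim), nn.Sigmoid())

    def forward(self, audio, visual):
        q = self.a_proj(audio)                # audio frames as queries
        kv = self.v_proj(visual)              # visual frames as keys/values
        fused, _ = self.attn(q, kv, kv)
        return audio * self.mask(fused + q)   # residual connection, then mask

# Toy shapes: 2 utterances, 100 frames, 257 frequency bins, 128-d visual features.
audio = torch.randn(2, 100, 257).abs()        # stand-in for a magnitude spectrogram
visual = torch.randn(2, 100, 128)
print(EarlyFusionSE()(audio, visual).shape)          # torch.Size([2, 100, 257])
print(CrossAttentionFusionSE()(audio, visual).shape) # torch.Size([2, 100, 257])

Early fusion is the simpler scheme but forces both modalities into a shared frame rate and feature space; the cross-attention variant lets each audio frame weight the visual evidence, which is one reason attention-based fusion is prominent in recent work.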
Topics:

This timely special issue aims to collate the latest advances in multimodal SE and SS systems that exploit both conventional and unconventional modalities to further improve state-of-the-art performance on benchmark problems. We particularly welcome submissions on novel deep neural network based algorithms and architectures, including new feature processing methods for multimodal and cross-modal speech processing. We also encourage submissions that address practical issues related to multimodal data recording, energy-efficient system design, and real-time low-latency solutions, such as for assistive hearing and speech communication applications.

Special Issue research topics of interest relate to open problems that need to be addressed. These include, but are not limited to, the following:
- Novel acoustic features and architectures for multimodal SE (MM-SE) and multimodal SS (MM-SS) solutions.
- Self-supervised and unsupervised learning techniques for MM-SE and MM-SS systems.
- Adversarial learning for MM-SE and MM-SS.
- Large language model based generative approaches for MM-SE and MM-SS.
- Low-delay, low-power, low-complexity MM-SE and MM-SS models.
- Integration of multiple data acquisition devices for multimodal learning, and novel learning algorithms robust to imperfect data.
- Few-shot/zero-shot learning and adaptation algorithms for MM-SE and MM-SS systems with small amounts of training and adaptation data.
- Approaches that effectively reduce model size and inference cost without reducing the speech quality and intelligibility of processed signals.
- Novel objective functions, including psychoacoustic and perceptually motivated loss functions, for MM-SE and MM-SS.
- Holistic evaluation metrics for MM-SE and MM-SS systems.
- Real-world applications and use cases of MM-SE and MM-SS, including human-human and human-machine communications.
- Challenges and solutions in integrating MM-SE and MM-SS into existing systems.

We encourage submissions that not only propose novel approaches but also substantiate the findings with rigorous evaluations, including on real-world datasets. Studies that provide insights into the challenges involved and the impact of MM-SE and MM-SS on end users are particularly welcome.

Submission Guidelines:

Manuscripts should be original and should not have been previously published or be currently under consideration for publication elsewhere. All submissions will be peer-reviewed according to the IEEE Signal Processing Society review process. Authors should prepare their manuscripts according to the Instructions for Authors available from the Signal Processing Society website.

Important Dates

Manuscript Submission Deadline: 11 November 2024
First Review Due: 15 December 2024
Revised Manuscript Due: 15 January 2025
Second Review Due: 15 February 2025
Final Decision: 28 February 2025

Guest Editors:
Amir Hussain, Edinburgh Napier University, UK
Yu Tsao, Academia Sinica, Taiwan
John H.L. Hansen, University of Texas at Dallas, USA
Naomi Harte, Trinity College Dublin, Ireland
Shinji Watanabe, Carnegie Mellon University, USA
Isabel Trancoso, Instituto Superior Técnico, IST, Univ. Lisbon, Portugal
Shixiong Zhang, Tencent AI Lab, USA

We look forward to your submissions.

Many thanks,

On behalf of the Guest Editorial Team

--
Prof Amir Hussain
School of Computing, Edinburgh Napier University, Scotland, UK
E-mail: A.Hussain@xxxxxxxx
http://cogmhear.org