[AUDITORY] New dataset of British English speech recordings for psychoacoustics (Trevor Cox)


Subject: [AUDITORY] New dataset of British English speech recordings for psychoacoustics
From:    Trevor Cox  <0000017832c3f089-dmarc-request@xxxxxxxx>
Date:    Fri, 18 Mar 2022 08:34:24 +0000

The dataset can be downloaded from https://doi.org/10.17866/rd.salford.16918180.v3 and is released under CC BY 4.0.

The Clarity Speech Corpus is a forty-speaker British English speech dataset. The corpus was created for running listening tests to gauge speech intelligibility and quality in the Clarity Project <http://claritychallenge.org/>, which has the goal of advancing speech signal processing by hearing aids through a series of challenges. The dataset is suitable for machine learning and other uses in speech and hearing technology, acoustics and psychoacoustics. The data comprises recordings of approximately 10,000 sentences drawn from the British National Corpus (BNC) with suitable length, words and grammatical construction for speech intelligibility testing. The collection process involved selecting a subset of BNC sentences, recording these as spoken by 40 British English speakers, and processing the recordings to create individual sentence recordings with associated prompts and metadata.

More details of how the dataset was created are in this Data in Brief paper: https://doi.org/10.1016/j.dib.2022.107951

Trevor

Prof @xxxxxxxx
Acoustical Engineering, University of Salford
+44 161 295 5474; +44 7986 557419


This message came from the mail archive
src/postings/2022/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University