Re: Wrod Ilntilitgelibiy wtih Pnoheme Cufisonon (Raphael Ullmann)


Subject: Re: Wrod Ilntilitgelibiy wtih Pnoheme Cufisonon
From:    Raphael Ullmann  <raphael.ullmann@xxxxxxxx>
Date:    Fri, 15 Aug 2014 13:26:53 +0200
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Dear David,

your question on phoneme confusion is certainly very interesting.

I suggest having a look at Stilp & Kluender's 2010 PNAS paper [1], which first discusses the importance of consonants vs. vowels to speech intelligibility, and then suggests that such linguistic constructs should be abandoned in favor of sensory measures.
More specifically, the authors evaluated the impact of replacing selected portions of the speech signal with 1/f noise. A measure of the degree of change in the signal over time (which the authors term "cochlea-scaled entropy") best predicted which signal portions were most critical to preserving speech intelligibility.

More recently, the cochlea-scaled entropy measure was also used to decide which speech portions to re-time around (known) fluctuating maskers, successfully increasing overall intelligibility [2].

However, I am not aware of studies that investigated distortions consisting of switching certain phonemes to other perceptually nearby phonemes, as you suggest.

Kind regards,
Raphael

[1] Stilp, C. E. & Kluender, K. R. Cochlea-scaled entropy, not consonants, vowels, or time, best predicts speech intelligibility. Proc. Natl. Acad. Sci. U.S.A. 107, 12387–92 (2010). URL: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2901476

[2] Aubanel, V. & Cooke, M. Information-preserving temporal reallocation of speech in the presence of fluctuating maskers. In Proc. Interspeech, 3592–3596 (2013). URL: http://laslab.org/upload/information-preserving_temporal_reallocation_of_speech_in_the_presence_of_fluctuating_maskers.pdf

--
Raphael Ullmann
Ph.D. Candidate
Idiap Research Institute
Ecole Polytechnique Fédérale de Lausanne
http://idiap.ch/~rullmann/

On 15.08.2014, at 06:59, David Klein <kleinsound@xxxxxxxx> wrote:

> Hi All,
>
> I am seeking references on the subject of human speech intelligibility as a function of individual phoneme distortions. I can't seem to find what I'm looking for. Can anybody help point me in the right direction?
>
> I'd specifically like to know how word intelligibility holds up when distortions of a particular phoneme class would cause members of that class to be highly confusable when presented in isolation.
>
> More generally, I wonder how well humans can do when consonants are relatively clear but vowels are highly ambiguous.
>
> I suppose two ways this might have been studied would have been using, on the one hand, noise or channel distortions specifically targeted to distorting certain phoneme classes; or, on the other hand, manipulating the signal by switching certain phonemes to other perceptually nearby phonemes.
>
> Cheers,
> Dvaid ;)
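
As a rough illustration of the "degree of change in the signal over time" idea mentioned in the reply, the sketch below computes the Euclidean distance between the magnitude spectra of successive short frames. This is only a simplified stand-in: it uses plain DFT bins rather than any cochlea-scaled filtering, and it is not the procedure from [1]. The function names, the frame length, and the synthetic two-tone test signal are all assumptions made for illustration.

```python
import cmath
import math


def frame_spectra(signal, frame_len):
    """Split the signal into non-overlapping frames and return the
    magnitude spectrum (non-negative frequency bins) of each frame."""
    spectra = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        mags = []
        # Naive DFT: slow, but dependency-free and fine for short frames.
        for k in range(frame_len // 2 + 1):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, x in enumerate(frame))
            mags.append(abs(acc))
        spectra.append(mags)
    return spectra


def spectral_change(signal, frame_len=64):
    """Euclidean distance between the spectra of successive frames.
    Larger values mean the spectrum changed more between two frames."""
    spectra = frame_spectra(signal, frame_len)
    return [math.dist(a, b) for a, b in zip(spectra, spectra[1:])]


# Hypothetical test signal: four frames of one steady tone followed by
# four frames of another, so the only spectral change is at the switch.
FRAME = 64
tone_a = [math.sin(2 * math.pi * 4 * n / FRAME) for n in range(4 * FRAME)]
tone_b = [math.sin(2 * math.pi * 12 * n / FRAME) for n in range(4 * FRAME)]
signal = tone_a + tone_b

changes = spectral_change(signal, frame_len=FRAME)
peak = max(range(len(changes)), key=changes.__getitem__)
print("frame-to-frame change:", [round(c, 3) for c in changes])
print("largest change between frames", peak, "and", peak + 1)
```

On this synthetic signal the change is near zero inside each steady tone and peaks at the boundary between the two tones, which is the kind of high-change region the reply describes as most critical to intelligibility.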


This message came from the mail archive
http://www.auditory.org/postings/2014/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University