looking for a transformation to "extract" prosody from speech (Christophe Pallier)


Subject: looking for a transformation to "extract" prosody from speech
From:    Christophe Pallier  <pallier(at)LSCP.EHESS.FR>
Date:    Thu, 7 May 1998 15:21:44 +0200

Dear list,

I would like to know whether it is possible to identify one's language
(among others) on the basis of prosodic properties like intonation and rhythm.
Does anyone know if this issue has already been seriously investigated?

I am looking for transformations to apply to the speech signal
to keep its prosodic properties (i.e. intonation and rhythm) while
deleting (or replacing), as far as possible, the phones.

I have considered two ideas:
 1. use low pass filtering (under 300-400 Hz).
      => but I would like stimuli that sound more like speech.
 2. use speech resynthesis, e.g. with a diphone synthesizer.
      => but labeling the original utterances is too time-consuming.

Any suggestions?

Christophe Pallier,
LSCP, CNRS-EHESS
54 bd Raspail, Paris, France.
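
Idea 1 in the message (low-pass filtering under 300-400 Hz) could be sketched roughly as below. This is only an illustration under stated assumptions, not the author's method: it assumes the speech is already available as a list of float samples at a known sample rate, and it uses a windowed-sinc FIR filter with a Hamming window, which is one common way to build a low-pass filter. The names `lowpass_fir`, `apply_fir`, and `rms` are hypothetical helpers introduced here for illustration.

```python
import math


def lowpass_fir(cutoff_hz, sample_rate, num_taps=101):
    """Windowed-sinc (Hamming) low-pass FIR coefficients, unity gain at DC.

    num_taps is assumed to be odd so the filter has a single center tap.
    """
    fc = cutoff_hz / sample_rate          # cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        # Ideal low-pass impulse response (sinc), with the x == 0 limit.
        if x == 0:
            ideal = 2 * fc
        else:
            ideal = math.sin(2 * math.pi * fc * x) / (math.pi * x)
        # Hamming window to reduce ripple from truncating the sinc.
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]      # normalize to gain 1 at 0 Hz


def apply_fir(signal, taps):
    """Centered convolution; output has the same length as the input.

    Samples beyond the edges of the signal are treated as zero.
    """
    half = len(taps) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + half - k
            if 0 <= j < len(signal):
                acc += t * signal[j]
        out.append(acc)
    return out


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


# Usage: keep roughly the band below 400 Hz of a signal sampled at 8 kHz.
# The two sine components stand in for "pitch-range" and "phone-range" energy.
sample_rate = 8000
mixture = [
    math.sin(2 * math.pi * 100 * i / sample_rate)
    + math.sin(2 * math.pi * 2000 * i / sample_rate)
    for i in range(2000)
]
filtered = apply_fir(mixture, lowpass_fir(400, sample_rate))
```

With these assumed settings the 2000 Hz component is strongly attenuated while the 100 Hz component passes nearly unchanged. As the message notes, the result keeps the low-frequency energy that carries intonation and rhythm but sounds muffled rather than speech-like.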


This message came from the mail archive
http://www.auditory.org/postings/1998/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University