looking for a transformation to "extract" prosody from speech
Dear list,
I would like to know whether it is possible to identify a speaker's language
(among other things) on the basis of prosodic properties such as intonation
and rhythm.
Does anyone know if this issue has already been seriously investigated?
I am looking for transformations to apply to the speech signal
to keep its prosodic properties (i.e. intonation and rhythm) while
deleting (or replacing), as far as possible, the phones.
I have considered two ideas:
1. Use low-pass filtering (below 300-400 Hz).
   => but I would like stimuli that sound more like speech.
2. Use speech resynthesis, e.g. with a diphone synthesizer.
   => but labeling the original utterances is too time-consuming.
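To make idea 1 concrete, here is a minimal sketch of the low-pass approach
using only the Python standard library. It is an illustration, not a tested
stimulus pipeline: the function names (lowpass_fir, apply_fir), the 400 Hz
cutoff, the 101-tap length, and the Hamming window are all assumptions chosen
for the example. The input is assumed to be a list of float samples at a known
sampling rate.

```python
import math


def lowpass_fir(cutoff_hz, sample_rate, num_taps=101):
    """Windowed-sinc low-pass FIR coefficients (Hamming window).

    num_taps is assumed to be odd so the filter has a single center tap.
    The taps are normalized to unity gain at 0 Hz.
    """
    fc = cutoff_hz / sample_rate          # cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        # Ideal low-pass impulse response (sinc), with its limit at x == 0.
        if x == 0:
            ideal = 2 * fc
        else:
            ideal = math.sin(2 * math.pi * fc * x) / (math.pi * x)
        # Hamming window to reduce ripple from truncating the sinc.
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def apply_fir(signal, taps):
    """Convolve signal with taps; output has the same length as the input.

    The filter is centered on each sample (no net delay) and samples
    outside the signal are treated as zero.
    """
    half = len(taps) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + half - k
            if 0 <= j < len(signal):
                acc += t * signal[j]
        out.append(acc)
    return out


def rms(samples):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


if __name__ == "__main__":
    sr = 8000  # assumed sampling rate in Hz
    # Toy signal: a 150 Hz component (in the F0 range) plus a 2000 Hz component.
    sig = [math.sin(2 * math.pi * 150 * n / sr)
           + math.sin(2 * math.pi * 2000 * n / sr) for n in range(2000)]
    filtered = apply_fir(sig, lowpass_fir(400, sr))
    print("RMS before:", rms(sig), "RMS after:", rms(filtered))
```

With a cutoff in this range the fundamental frequency contour and the slow
amplitude variations are largely retained while most spectral detail that
distinguishes phones is removed, which is also why the result sounds muffled
rather than speech-like.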
Any suggestions?
Christophe Pallier,
LSCP, CNRS-EHESS
54 bd Raspail, Paris, France.