PhD defense of Gregory Beller - Wednesday 24 June 2009 - "Analyse et modèle génératif de l'expressivité." (Greg Beller)


Subject: PhD defense of Gregory Beller - Wednesday 24 June 2009 - "Analyse et modèle génératif de l'expressivité."
From:    Greg Beller  <Greg.Beller@xxxxxxxx>
Date:    Sun, 14 Jun 2009 18:30:54 +0200
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

[Apologies if you receive multiple copies]
____________________________________________________________________________

Grégory BELLER will publicly defend his IRCAM-Paris VI doctoral thesis:

"Analyse et modèle génératif de l'expressivité. Application à la parole et à l'interprétation musicale"
(Analysis and Generative Model of Expressivity. Application to Speech and to Musical Performance)

WEDNESDAY, 24 JUNE 2009 at 2:00 pm, Salle Stravinsky, IRCAM
IRCAM, 1 pl. Igor Stravinsky, Paris, France

The thesis was supervised by Xavier Rodet and carried out at IRCAM within the Sound Analysis/Synthesis team.

The examining board:
Gérard Bailly            rapporteur          GIPSA-lab
Christophe D'Alessandro  examiner            LIMSI-CNRS
Laurence Devillers       rapporteure         LIMSI-CNRS
Thierry Dutoit           examiner            TCTS
Axel Roebel              examiner            IRCAM
Xavier Rodet             thesis supervisor   IRCAM
Jean-Luc Zarader         examiner            ISIR

The defense will be held in French. It is public and all are welcome within the limits of available seating. It will be followed by a reception on site. For those unable to attend, it will be broadcast at the following address: http://video.ircam.fr/

Abstract

This thesis is part of current research on emotions and emotional reactions, on the modelling and transformation of speech, and on musical performance. The capacity to express, to simulate and to identify emotions, moods, intentions or attitudes appears to be fundamental to human communication. The ease with which we understand the state of a character, from the mere observation of actors' behaviour and the sounds they utter, shows that this source of information is essential and sometimes even sufficient in our social relationships. Although the emotional state has the peculiarity of being idiosyncratic, that is, particular to each individual, the same is not true of the associated reaction, which manifests itself through gesture (movement, posture, face...) and sound (voice, music...) and which is observable by others. This leads us to think that it is possible to transform this reaction in order to modify the perception of the associated emotion.

This is why the analysis-transformation-synthesis paradigm for emotional reactions is gradually being introduced into the therapeutic, commercial, scientific and artistic domains. This thesis falls within the latter two domains and makes several contributions.

From a theoretical point of view, this thesis proposes a definition of expressivity, a definition of neutral expressivity, a new mode of representing expressivity, and a set of expressive categories common to speech and music. It situates expressivity within an inventory of the levels of information available in an interpretation, which can be seen as a model of artistic performance. It proposes an original model of speech and its constituents, as well as a new hierarchical prosodic model.

From an experimental point of view, this thesis provides a protocol for acquiring acted expressive data. Alongside this, it makes three corpora available for the observation of expressivity. It provides a new statistical measure of the degree of articulation, as well as several analysis results concerning the influence of expressivity on speech.

From a technical point of view, it proposes a signal processing algorithm for modifying the degree of articulation. It presents an innovative corpus management system that is already used by other automatic speech processing applications requiring corpus manipulation. It describes the construction of a Bayesian network as a generative model of context-dependent transformation parameters.

From a technological point of view, an experimental system for high-quality transformation of the expressivity of a neutral French utterance, whether synthetic or recorded, has been produced, together with an on-line interface for perceptual tests.

Finally, and above all, from a prospective point of view, this thesis proposes several research directions for the future, on the theoretical, experimental, technical and technological levels. Among these, comparing the manifestations of expressivity in verbal and musical performance appears to be a promising avenue.

Keywords
Emotions, expressivity, artistic performance, musical performance, speech, prosody, speech signal transformation, generative modelling, machine learning, Bayesian network.


This message came from the mail archive
http://www.auditory.org/postings/2009/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University