PhD position on a diagnostic method for voice quality at Orange Labs in Lannion (France)

Campaign 2010
Description of the PhD

Orange Labs Supervisor: Vincent Barriac
Supervisor email: Vincent.barriac@xxxxxxxx
Location: Lannion (France)
PhD title: Development of a technical diagnostic method for voice quality impairments perceived in telephone communications, based on an analysis of the speech signal.

Global context and state of the art

The perceived quality of voice communications can be assessed with two rather distinct families of tools:
- Signal analysis techniques, ranging from simple measures such as signal level or spectrum to complex "psycho-acoustical" models that combine signal analysis with a model of human perception and judgment (such as PESQ, ITU-T P.862).
- Parametric techniques, based on an interpretation of technical factors linked not to the signal itself but to the way it has been processed and transported inside the network.

Within both families, particularly accurate methods have recently been developed, allowing pertinent prediction and estimation of perceived voice quality.

More recently, new "hybrid" approaches have been developed, so called because they combine measurements on the signal with parametric indications, in particular in the context of voice over IP. The complementarity of the signal-based and parametric families makes it possible (in theory) to combine their respective advantages: the accuracy of signal-based techniques, and the ability of parametric tools to be deployed without constraints on CPU or on signal decoding. Furthermore, parametric methods bring elements of understanding about the underlying technical causes (e.g. packet losses may explain cuts in the signal).

But all these methods share a drawback: they do not link perceived impairments to their origins. Some academic studies have addressed this question, but without conclusive results so far. From an operational point of view, however, the real goal of any assessment technique is to find the causes of issues and propose fixes.
PhD objectives / Expected results / Scientific challenges / Key issues

The basic idea behind this new study is that it is now realistic to provide operational supervision teams with powerful diagnostic tools that give them an expert view of perceived voice quality impairments in telephone communications and let them troubleshoot these impairments in depth. The objective of this study is therefore the development of such a tool, combining analysis of the audio signal and interpretation of parametric data.

This study will focus specifically on VoIP services and architectures. These are based on IMS solutions (SIP protocol) provided by a few technology vendors to France Telecom/Orange. Diagnostic rules generally depend heavily on the specific characteristics of services and networks, so extrapolating the results of this study to general (or even standardisable) rules is not easy to foresee, and we will therefore not work in that direction.

This work will be undertaken in close cooperation, first with our researchers specialised in the development of algorithms and models for voice signal processing (voice quality measurement, speech coding, voice enhancement), and afterwards with operational teams who know the characteristics of the network equipment and can provide the data on real incidents needed to set up diagnostic rules.

Methodological approach proposed by the supervisor

We foresee two steps:
- detection in the speech signal of perceived and annoying degradations, classified in general categories:
    o cuts in the signal, loss of information
    o distortion of the audio signal
    o different types of noise
    o signal level modifications
    o various impairments linked with interaction issues (e.g. echo)
- determination of more detailed sub-categories (e.g. for noise: distinction according to spectral content and level), linked with known and identified technical causes

The first step is clearly and purely signal processing oriented.
We must mention that recent PhD studies (e.g. M. Wältermann at DT, N. Côté and A. Leman at FT) started this work and already determined degradation categories (for listening-only contexts), as well as first (still perfectible) detection algorithms.

The second step is the main focus of the current study. It will combine the existing algorithms (or enhancements of them) with the analysis of IP parametric information (packet loss ratio and its distribution over time, network equipment counters or trouble tickets, measurements performed on terminals, etc.).

Global schedule

The schedule follows the two steps described above:
- Enhancement of existing algorithms, to allow detection of sub-categories as well as "recognition" of the signature of some signal processing features (noise reduction, coding and transcoding, etc.). This is the hardest and longest part of the study.
- Setting up of diagnostic rules to link these new sub-categories to real technical issues, thanks to a combination of measurements on the signal and of parametric data. An expert system based on neural networks is foreseen, but other approaches can be envisaged as well.

Campaign 2010
Thesis description sheet

Orange Labs supervisor: Vincent Barriac
Supervisor's email address: Vincent.barriac@xxxxxxxx
Site: Lannion
Thesis subject (title): Development of a technical diagnostic method for degradations of the perceived voice quality of telephone communications, based on an analysis of the speech signal

Global context of the study and state of the art

The assessment of the perceived quality of voice communications relies on two rather distinct families of techniques:
- Signal analysis techniques, from simple measurements of signal level or spectrum up to so-called psycho-acoustic models, that is, models combining signal analysis with a model of perception and judgment. The best known of these models is PESQ (ITU-T P.862).
- Techniques that interpret technical indicators linked not to the signal itself but to the way it has been processed and transported by the network. These are called parametric methods.

Both families have produced particularly accurate methods, which now make it possible to predict or estimate perceived quality in a pertinent way.

More recently, hybrid approaches have appeared, combining measurements on the signal and parametric indications, notably in the field of transport over IP. The complementarity of the two approaches should make it possible to combine the advantages of both families: the accuracy of signal measurements, and the ability of parametric methods to be used without constraints on CPU or signal decoding. Parametric methods also bring elements of understanding of technical faults (for example, a packet loss measurement to explain cuts in the signal).

All these methods nevertheless have one shortcoming: they do not make it possible to link a perceived degradation to its cause. A few studies have begun to address this link, but they are still at an early stage. Yet from an operational point of view, this is what quality measurement methods and tools should serve above all: finding the origin of observed faults, or even proposing corrective actions.

Thesis objectives / Expected results / Scientific and technical challenges to address
The idea behind launching this study is that it is possible and realistic to equip the operational teams in charge of supervising telecommunications networks and services with powerful diagnostic tools able to assess perceived quality faults in telephone communications and to deduce their underlying technical causes (and a fortiori the solutions to apply). The objective is therefore to build such a tool, combining analysis of the audio signal and interpretation of parametric data.

This study will be restricted to the VoIP network architectures of France Télécom / Orange, based on IMS (SIP protocol), and to the France Télécom / Orange suppliers of these architectures only. Extrapolating this work towards a generalisation of the diagnostic rules (which will depend strongly on particular architectures) or towards standardisation therefore does not seem easy to envisage (nor necessarily desirable).

This work will be carried out in close collaboration with (first part) the researchers in charge of developing algorithms and models for voice signal processing (voice quality measurement, speech coding, signal enhancement), and also (second part) with the operational teams who know the network equipment and can provide real incident data for building diagnostic rules.
Methodological approach proposed by the technical lead (specify the skills required for the approach)

The approach we favour has two steps:
- detection in the speech signal of perceptible and annoying degradations, among general categories:
    o cuts in the signal, loss of information
    o distortion, deformation of the signal
    o presence of background noise
    o modification of the signal level
    o faults linked to interaction difficulties (echo, in particular)
- determination of more precise sub-categories (for example, for noise: distinguishing the type and amplitude of the noise), linked to identified, foreseeable technical causes.

The first step is purely focused on signal processing. It should be mentioned that recent work (theses by M. Wältermann at DT, and by N. Côté and A. Leman at FT) has cleared much of the ground (in the listening context, not in the conversational context, which remains to be studied), since the main dimensions are known and (perfectible) detection algorithms have been developed.

The second step, which is the subject of this study, will combine these algorithms (or rather refinements of them) with the analysis of IP parametric information (packet loss rates and their distribution over time, events on network equipment accessible through counters or CDRs, results of measurements performed by terminals and sent back by them over the network, etc.).
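
As an illustration of the kind of IP parametric information mentioned above (packet loss rate and its distribution over time), here is a minimal Python sketch. It is not part of the proposed study: the function name `loss_stats`, its arguments, and the use of plain packet sequence numbers (with no wraparound handling) are assumptions made for the example only.

```python
from collections import Counter


def loss_stats(received_seq, first_seq, last_seq):
    """Return (loss_ratio, burst_histogram) over [first_seq, last_seq].

    Hypothetical helper: assumes sequence numbers increase by one per
    packet and never wrap around within the analysed window.
    burst_histogram maps "number of consecutive lost packets" to
    "how many times such a burst occurred".
    """
    expected = last_seq - first_seq + 1
    received = {s for s in received_seq if first_seq <= s <= last_seq}

    bursts = Counter()
    run = 0  # length of the current run of consecutive lost packets
    for seq in range(first_seq, last_seq + 1):
        if seq in received:
            if run:
                bursts[run] += 1
            run = 0
        else:
            run += 1
    if run:  # the window ended in the middle of a loss burst
        bursts[run] += 1

    lost = expected - len(received)
    return lost / expected, dict(bursts)


if __name__ == "__main__":
    # Packets 4, 5 and 8 are missing from a window of ten packets.
    ratio, histogram = loss_stats([1, 2, 3, 6, 7, 9, 10], 1, 10)
    print(ratio, histogram)
```

The burst histogram is one simple way to describe the distribution of losses over time: two streams with the same loss ratio may differ strongly in burstiness, which is the sort of distinction a diagnostic rule could exploit.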
Overall schedule of the thesis (main outline)

The proposed thesis therefore breaks down into two clearly distinct parts:
- Improvement of existing algorithms, so that they can detect sub-categories and "recognise" the signature of certain processing operations (notably noise reduction, coding and transcoding). This is the hardest and longest part of the study.
- Establishment of diagnostic rules linking these sub-categories to real faults, through a combination of measurements on the signal and parametric information. An expert system based on neural networks is envisaged, but other solutions can be imagined.

Secondary contributions, if planned (participation in collaborative projects)
None for now