5pSC4. Evaluation of a phrase-spotting method incorporating prosodic information in spontaneous speech.

Session: Friday Afternoon, December 6

Time: 2:50


Author: Toshiyuki Hanazawa
Location: Mitsubishi Electric Corp., Information Technol. R&D Ctr., Human Media Technol. Dept., 5-1-1, Ofuna, Kamakura, Kanagawa, 247 Japan
Author: Yoshiharu Abe
Location: Mitsubishi Electric Corp., Information Technol. R&D Ctr., Human Media Technol. Dept., 5-1-1, Ofuna, Kamakura, Kanagawa, 247 Japan
Author: Kunio Nakajima
Location: Mitsubishi Electric Corp., Information Technol. R&D Ctr., Human Media Technol. Dept., 5-1-1, Ofuna, Kamakura, Kanagawa, 247 Japan

Abstract:

Spotting phrases in continuous speech is a difficult task, especially in spontaneous speech. To cope with this problem, a phrase-spotting method that incorporates prosodic information has been proposed. This paper describes evaluation results for the method when applied to spontaneous speech. In this method, the prosodic likelihood of the phrase boundaries is calculated statistically from a pitch pattern HMM network and integrated into the spotting score of the phrase. The pitch pattern HMM network models the pitch contour of sentences by connecting pitch pattern HMMs, each of which models the pitch contour of a phrase. The use of prosodic likelihood is expected to reduce false alarms. The method was evaluated using the ATR spontaneous speech database. For the pitch pattern HMMs, two types of HMMs with different lengths were used for long- and short-duration phrases and filled pauses. To construct an accurate pitch pattern HMM network, a bigram model was applied to the transition probabilities between the pitch pattern HMMs. Using the prosodic likelihood improved the phrase detection rate from 53% to 62%. Thus it was confirmed that the method is effective for spontaneous speech.
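
The two statistical ingredients named in the abstract, bigram transition probabilities between pitch pattern HMMs and the integration of a prosodic likelihood into the spotting score, might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the integration formula, so a weighted sum of log scores is assumed here, and the add-alpha smoothing, the `weight` and `threshold` parameters, the label names, and all function names are hypothetical.

```python
from collections import Counter


def bigram_transition_probs(sequences, labels, alpha=1.0):
    """Estimate P(next | prev) between pitch pattern HMM labels.

    Assumption: add-alpha smoothing over a fixed label set; the abstract
    only states that a bigram model was applied to the transitions.
    """
    pair_counts = Counter()
    prev_counts = Counter()
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            pair_counts[(prev, nxt)] += 1
            prev_counts[prev] += 1
    k = len(labels)
    return {
        (p, n): (pair_counts[(p, n)] + alpha) / (prev_counts[p] + alpha * k)
        for p in labels
        for n in labels
    }


def integrated_score(acoustic_score, prosodic_log_likelihood, weight=0.5):
    """Combine an acoustic spotting score with a prosodic log-likelihood.

    Assumption: a weighted sum in the log domain; `weight` is hypothetical.
    """
    return acoustic_score + weight * prosodic_log_likelihood


def detect(candidates, threshold, weight=0.5):
    """Keep candidate phrases whose integrated score reaches the threshold."""
    return [
        c["phrase"]
        for c in candidates
        if integrated_score(c["acoustic"], c["prosodic"], weight) >= threshold
    ]


# Hypothetical label set and training sequences, for illustration only.
labels = ["long", "short", "filler"]
sequences = [["long", "short", "filler"], ["long", "short", "long"]]
probs = bigram_transition_probs(sequences, labels)

# Two hypothetical candidates with equal acoustic scores; the second has a
# poor prosodic likelihood at its boundaries and is rejected once prosody
# is taken into account.
candidates = [
    {"phrase": "a", "acoustic": -10.0, "prosodic": -2.0},
    {"phrase": "b", "acoustic": -10.0, "prosodic": -8.0},
]
with_prosody = detect(candidates, threshold=-12.0, weight=0.5)
without_prosody = detect(candidates, threshold=-12.0, weight=0.0)
```

In this toy setting, setting `weight` to zero accepts both candidates, while a positive weight rejects the one whose boundaries are prosodically implausible, which is the kind of false-alarm reduction the abstract anticipates.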


ASA 132nd meeting - Hawaii, December 1996