5pSC16. TEMAX: Visualization of temporal variation in spontaneous speech.

Session: Friday Afternoon, December 6

Time: 5:50


Author: Shigeyoshi Kitazawa
Location: Dept. of Comput. Sci., Faculty of Information, Shizuoka Univ., 5-1, 3-Chome, Jouhoku, Hamamatsu, 432 Japan
Author: Hideya Ichikawa
Location: Dept. of Comput. Sci., Faculty of Information, Shizuoka Univ., 5-1, 3-Chome, Jouhoku, Hamamatsu, 432 Japan
Author: Satoshi Kobayashi
Location: Dept. of Comput. Sci., Faculty of Information, Shizuoka Univ., 5-1, 3-Chome, Jouhoku, Hamamatsu, 432 Japan

Abstract:

This study presents a new method for analyzing speech rhythm. First, speech rate is measured and displayed in terms of the syllable (or mora) and the stress unit (rhythmic foot). Then, the TEMAX algorithm (temporal evaluation and measurement algorithm by KS) is applied to the speech envelope: the speech wave is half-wave rectified, low-pass filtered at 20 Hz, and sampled at 40 Hz. A DFT of the envelope with a 1-s window is convenient for bringing out isosyllabic characteristics. For Japanese, the TEMAX-gram, a sonagraphic output, shows two dark bars, called rhythmic formants RF1 and RF2: the first at around 8 Hz and the second at about half that frequency. RF1 corresponds to speech rate, which appears almost steady in read speech and monolog but varies widely in spontaneous speech. RF2 represents the bimoraic rhythmic foot, that is, a combination of two adjacent moras forming a single large power peak. For English, isochronic characteristics are observable as RF1 with a 2-s window. Furthermore, with a 1-s window the TEMAX-gram displays the periodicity of syllables between stresses as broken bars at around 5 to 10 Hz. This approach appears to be a promising tool for analyzing extremely different rhythmic structures.
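
The processing chain described in the abstract (half-wave rectification, 20-Hz low-pass filtering, a 40-Hz envelope, and a DFT over a 1-s window) can be illustrated with a minimal sketch. This is not the authors' TEMAX implementation. The function names, the windowed-sinc filter design, the Hann taper, the mean removal, and the 0.1-s hop are all assumptions made for illustration; the abstract specifies only the cutoff frequency, the envelope sampling rate, and the window length.

```python
import numpy as np


def envelope_40hz(wave, fs, cutoff_hz=20.0, env_fs=40):
    """Half-wave rectify, low-pass filter, and downsample to env_fs.

    Assumes fs is an integer multiple of env_fs.
    """
    rectified = np.maximum(wave, 0.0)  # half-wave rectification

    # Windowed-sinc FIR low-pass filter. The filter type is an assumption;
    # the abstract gives only the 20-Hz cutoff.
    half_len = 2 * int(fs / cutoff_hz)
    n = np.arange(-half_len, half_len + 1)
    taps = np.sinc(2.0 * cutoff_hz / fs * n) * np.hamming(n.size)
    taps /= taps.sum()  # unity gain at 0 Hz
    smoothed = np.convolve(rectified, taps, mode="same")

    step = int(round(fs / env_fs))
    return smoothed[::step]  # envelope sampled at env_fs


def rhythm_spectrogram(env, env_fs=40, window_s=1.0, hop_s=0.1):
    """Sliding-window DFT magnitude of the envelope.

    Returns (freqs, frames), where frames has one row per window position.
    The hop size and the Hann taper are illustrative choices.
    """
    win = int(round(window_s * env_fs))
    hop = max(1, int(round(hop_s * env_fs)))
    taper = np.hanning(win)
    frames = []
    for start in range(0, len(env) - win + 1, hop):
        seg = env[start:start + win]
        # Subtract the mean so the 0-Hz component does not mask rhythm peaks.
        seg = (seg - seg.mean()) * taper
        frames.append(np.abs(np.fft.rfft(seg)))
    freqs = np.fft.rfftfreq(win, d=1.0 / env_fs)
    return freqs, np.array(frames)
```

Under these assumptions, calling envelope_40hz on a waveform and passing the result to rhythm_spectrogram yields a time-by-frequency magnitude array covering 0 to 20 Hz in 1-Hz steps for a 1-s window. Setting window_s=2.0 gives the finer 0.5-Hz resolution of the 2-s analysis mentioned for English. A steady 8-Hz amplitude modulation should appear as a ridge near 8 Hz, loosely analogous to the RF1 bar described above.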


ASA 132nd meeting - Hawaii, December 1996