Abstract:
It is well known that speech rate varies both globally and locally in natural discourse, owing to factors such as contrastive stress, syntactic boundaries, and emotion. While the global speech rate can be clearly defined in terms of the durations of utterances and pauses, the local speech rate has not been well defined. The present authors have previously proposed a rigorous and quantitative definition of the relative local speech rate and presented an objective method for its measurement [S. Ohno and H. Fujisaki, Proc. EUROSPEECH'95, Vol. 1, pp. 421--424 (1995)]. Based on an analysis of the changes in global and local speech rates observed in speech material consisting of readings of a story at various speech rates, the present paper proposes rules for controlling the global and local speech rates so that synthetic discourse fits exactly within a specified time interval. The validity of the method was tested and confirmed by perceptual experiments using synthetic discourse of various durations generated from natural discourse by analysis--resynthesis.