Abstract:
Taiwanese, in common with other ancient Chinese dialects, has a more complex tonal structure than the well-known standard Peking Mandarin. One important feature is the notion of ``long'' and ``short'' tones. To build a successful computer-aided speech recognition system for Taiwanese, it is vital to understand this much-overlooked phenomenon. In a set of experiments, we set out to determine how subjects distinguish long from short tones. A palette of long and short words was composed such that each long word has a short counterpart and vice versa. Using a resynthesis tool, these words were gradually modified to resemble the opposite tone in order to study the boundaries of perception. Two main features were investigated: word duration and energy drop-off (or glottal stop). The experiments reveal that duration has an insignificant impact on the distinction between long and short tones, whereas the energy drop-off has a strong impact: a word with a steep energy drop-off is more likely to be perceived as a short tone than a word with a gentle energy drop-off.