Spectral tilt and sharper peaks vs TIMIT ("J. Scott Merritt")


Subject: Spectral tilt and sharper peaks vs TIMIT
From:    "J. Scott Merritt"  <merrij3(at)RPI.EDU>
Date:    Wed, 29 Sep 2004 17:32:54 -0400

Hi,

I've recently begun some speech recognition work and would really appreciate some assistance. If this is an inappropriate location for beginner questions, alternate suggestions would be appreciated :)

My current problem is the seemingly large disparity between the spectral characteristics of the TIMIT continuous speech corpus and speech samples that I've recorded locally. More specifically, the locally recorded speech seems to have more power in the upper frequencies (> 2000 Hz) and *much* sharper frequency peaks throughout the spectrum. On the low end of the spectrum, the locally recorded data has very sharp peaks at (what appears to be) the fundamental pitch (F0) and its harmonics. These are much less prominent in the TIMIT corpus.

I reckon the spectral tilt could be due to differences in the frequency response of my microphone (Shure RS130), pre-amp (Rolls MP13), or other assorted components. However, I am currently at a loss to explain the "peakiness" of my data relative to that in the TIMIT corpus.

Many thanks,
Scott.
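
For anyone wanting to put numbers on the two differences described above, here is a minimal sketch, assuming NumPy and that both recordings are already loaded as mono sample arrays at the same sampling rate. The function names, the 512-sample frame, and the 2000 Hz cutoff are illustrative choices, not anything taken from the original analysis. It reports the fraction of power above the cutoff (a rough stand-in for spectral tilt) and the mean spectral flatness (lower values indicate a "peakier" spectrum).

```python
import numpy as np

def frame_power_spectra(signal, sample_rate, frame_len=512, hop=256):
    """Return (freqs, spectra): one Hann-windowed power spectrum per frame."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    frames = np.stack([signal[s:s + frame_len] * window for s in starts])
    spectra = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    return freqs, spectra

def high_band_fraction(freqs, spectra, cutoff_hz=2000.0):
    """Fraction of frame-averaged power that lies above cutoff_hz."""
    mean_power = spectra.mean(axis=0)
    return float(mean_power[freqs > cutoff_hz].sum() / mean_power.sum())

def mean_spectral_flatness(spectra, eps=1e-12):
    """Per-frame geometric mean over arithmetic mean, averaged over frames.

    Values near 1 mean a flat, noise-like spectrum; values near 0 mean
    energy concentrated in a few sharp peaks. eps only guards log(0).
    """
    power = spectra + eps
    flatness = np.exp(np.log(power).mean(axis=1)) / power.mean(axis=1)
    return float(flatness.mean())
```

Running the same three functions on a TIMIT utterance and on a local recording, with identical frame settings, gives two pairs of numbers that can be compared directly. Loading the audio is left out here, and the frame length matters: longer frames resolve individual harmonics more sharply, so the settings need to match across the two sources for the comparison to mean anything.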


This message came from the mail archive
http://www.auditory.org/postings/2004/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University