Subject:
From: "(Yoshitaka Nakajima)" <nakajima(at)KYUSHU-ID.AC.JP>
Date: Mon, 21 Sep 1992 15:41:58 JST

Dear Al and dear colleagues,

I think that debugging experiments on this network is far better than being debugged after submitting a paper. Some people may abuse the system, but having several different systems for scientific discussion, e.g., journals, conferences and e-mail networks, is a good way to avoid serious abuse of any one of them. I agree with Dr. Hartmann's proposal that we need a certain standard for claim staking. Perhaps we can discuss the problem after gathering some examples of claim staking.

Al's research plan seems interesting. I think it is very important to investigate the relationship between primitive organization and speech perception. Although I am almost a layman in the field of speech perception, I am inclined to think that combining spectral components to perceive speech and hearing out a single component are two rather independent things, which can be performed simultaneously without serious conflict.

My opinion is based on an observation made by my students and me. We made three spectral envelopes simulating the Japanese vowels /a/, /i/ and /u/. Inharmonic spectral components, spaced at intervals of 500 cents, glided in the same direction (upward or downward) by 500 cents over 400 ms. Each stimulus tone had a rise time and a fall time of 10 ms. The question related to the present topic was whether we could perceive the vowels or not. I must say it was sometimes difficult to perceive any vowels, but eventually we learned how to perceive vowels in these peculiar stimuli. Some people could perceive the vowels immediately. When two different stimuli, e.g., /a/ and /i/, were presented successively, the vowels were rather clear. Interestingly, all of us had the impression that these vowels were spoken by several speakers. In my case, about six male speakers seemed to utter the vowels.
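The glide stimulus described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure actually used: the base frequency, the number of components, the sample rate, the linear ramp shape and the glide being linear in cents are all hypothetical choices, and the vowel spectral envelope is not modelled (all components have equal amplitude).

```python
import math


def glide_frequencies(base_hz, n_components, spacing_cents=500.0,
                      glide_cents=500.0, upward=True):
    """Return (start_hz, end_hz) for each component.

    Components are spaced `spacing_cents` apart and all glide by
    `glide_cents` in the same direction. `base_hz` and `n_components`
    are assumed values, not taken from the original observation.
    """
    sign = 1.0 if upward else -1.0
    pairs = []
    for k in range(n_components):
        start = base_hz * 2.0 ** (k * spacing_cents / 1200.0)
        end = start * 2.0 ** (sign * glide_cents / 1200.0)
        pairs.append((start, end))
    return pairs


def synthesize(pairs, duration_s=0.4, ramp_s=0.01, sample_rate=44100):
    """Sum equal-amplitude gliding sinusoids and apply 10 ms ramps."""
    n = int(round(duration_s * sample_rate))
    ramp_n = int(round(ramp_s * sample_rate))
    samples = [0.0] * n
    for start, end in pairs:
        log_ratio = math.log(end / start)
        phase = 0.0
        for i in range(n):
            t = i / sample_rate
            # Assumed glide shape: frequency moves linearly in cents.
            freq = start * math.exp(log_ratio * t / duration_s)
            phase += 2.0 * math.pi * freq / sample_rate
            samples[i] += math.sin(phase)
    for i in range(n):
        # Linear rise and fall (assumed shape), scaled to stay within [-1, 1].
        gain = min(1.0, i / ramp_n, (n - 1 - i) / ramp_n)
        samples[i] *= gain / len(pairs)
    return samples


pairs = glide_frequencies(base_hz=200.0, n_components=6)
samples = synthesize(pairs)
```

Because the spacing and the glide extent are both 500 cents, in an upward glide each component ends at the frequency where its upper neighbour started.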
Now, the perception of vowels can take place only when we can integrate several spectral components perceptually. On the other hand, perceiving several voices means that we are somehow separating the inharmonic components. This observation suggests that speech perception takes place even when the spectral components are segregated at the level of primitive organization. There is, however, another possible explanation: the listeners may have picked out just the formants to perceive the vowels, and the rest of the components may have contributed to increasing the number of voices. But then it would be difficult to explain why the voices are heard as male voices. Reducing the number of components would give us a clearer view, and I am planning to make some more observations with one of my students.

Yoshitaka Nakajima