Subject: Verbal Transformations
From: Richard M Warren <rmwarren(at)CSD.UWM.EDU>
Date: Fri, 23 May 1997 16:50:29 -0500

Dear List:

I've read with interest the various models and theories that have followed my brief description and query concerning the basis for dichotic verbal transformations, a phenomenon that conventional theories of speech perception cannot handle. However, as Al observed on 20 May "... none of them has addressed Dick's major observation. How could one 'binding' of properties (say the voice at the left ear) habituate independently of the other binding when it involves the same word?" Pierre on 20 May stated that "... I think that the phenomenon has not been described in sufficient detail to draw any conclusion." I agree with Pierre, and believe that speculation concerning the basis for this illusion can be sharpened if I describe some of the published quantitative aspects of this effect.

But before doing so, lest it be thought that I have not given considerable thought to the problems raised by this illusion (and indeed, that thought did not precede the experiments), I would like to express my ideas concerning this and similar issues. Unlike intelligently designed systems and models, biological systems often achieve amazing results through mechanisms that are neither the simplest nor the most elegant, but that reveal themselves through experiments (consider transduction by the inner ear vs. a microphone). A second point is that there is often more than one mechanism available for perceptual analysis of sensory data. This has been brought home to me once again by our experiments on the verbal organization of repeated sequences of steady-state vowels, in which the high-frequency and low-frequency components undergo independent verbal organization [Warren, Healy, & Chalikia, JASA, 1996, 100, 2452-2461].

Now for the nitty-gritty of verbal transformations.
First, the basic illusion: verbal transformations occur while listening to a repeated sequence of speech sounds. The rate of illusory changes (transformations) is equivalent for monaural and diotic presentations. The stimuli can be isochronous sequences of steady-state vowels (item durations 30 - 100 ms), syllables, lexical items, phrases, or short sentences. While there is considerable individual variability, for a monosyllabic word repeated twice/sec, there are about 5 - 10 changes/min with young adults (18 - 25 yr). The rate of change is higher for children (8 yr), and dramatically lower for the elderly (62 - 86 yr) -- about 1/5 that of young adults.

The forms heard by young adults are often neologisms and, as also reported for jargon aphasia, follow the phonotactic rules for phonemic clustering in English. In addition, the forms for young adults almost always consist of syllables actually occurring in English (question -- is this also true in jargon aphasia?). Young children occasionally violate phonotactic rules, and the elderly almost always restrict their responses to lexical items.

One further finding: with young adults, reducing the repetition rate by introducing pauses between repetitions results in a proportional decrease in the rate of verbal transformations (i.e., the same number of repetitions produces the same number of changes).

With this out of the way, let's consider dichotic verbal transformations reported with the same repeated word delivered to each ear twice/sec at an interaural asynchrony of exactly 1/2 of the repetition period. The independent changes to other words and syllables occur at the same rate at each ear (no right-ear advantage). The changes occur asynchronously at each ear, and the concurrent forms at any given time are generally independent of each other.
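To make the timing concrete, here is a minimal sketch in Python of the dichotic onset schedule and of the rate arithmetic described above. It is an illustration only, not the published procedure: the function names are hypothetical, and the choice of the left ear as the leading ear is an assumption made purely for the example.

```python
def dichotic_onsets(rate_hz=2.0, n_reps=4):
    """Return (left, right) word-onset times in seconds.

    The same word repeats at rate_hz in each ear; one ear lags the other
    by exactly half the repetition period. Which ear leads is an
    assumption here (left leads), not something stated in the text.
    """
    period = 1.0 / rate_hz          # 0.5 s for a word repeated twice/sec
    asynchrony = period / 2.0       # interaural offset: half the period
    left = [i * period for i in range(n_reps)]
    right = [t + asynchrony for t in left]
    return left, right


def changes_per_repetition(changes_per_min, rate_hz=2.0):
    """Convert a transformation rate per minute into changes per repetition."""
    repetitions_per_min = rate_hz * 60.0
    return changes_per_min / repetitions_per_min


if __name__ == "__main__":
    left, right = dichotic_onsets()
    print("left-ear onsets (s): ", left)
    print("right-ear onsets (s):", right)
    # The 5 - 10 changes/min reported for young adults at twice/sec
    for per_min in (5, 10):
        print(per_min, "changes/min =",
              round(changes_per_repetition(per_min), 3), "changes/repetition")
```

Expressing the rate per repetition rather than per minute matches the finding that inserting pauses lowers the rate per minute proportionally while leaving the number of changes per repetition unchanged.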
When listeners are instructed to monitor changes on both sides, the combined rate of change at the two ears is much less than the rate occurring with either monaural or diotic stimulation. When listeners are instructed to monitor only one ear with the dichotic stimulus, the rate at that ear is about the same as the combined rate when both ears are monitored.

One further point related to construction of an appropriate model -- it is not only habituation or satiation that occurs at each side -- there are also independent shifts in the criteria for acceptable tokens by the nodes, templates, cell assemblies, or whatever, that are responsible for the recognition of verbal forms. Changes occur when the salience of a perceived form is exceeded by the salience of a new form resulting from a criterion shift [see Warren, "Criterion shift rule and perceptual homeostasis", Psych. Rev., 1985, 92, 574-584].

A final item: to address a question raised by Al, Jim Bashford took the repeating word "calculate" (800 ms duration), highpassed and lowpassed it at 1,500 Hz, and delayed one version relative to the other. Both versions were presented diotically, and when there was an asynchrony of 100 ms or more between the two versions, as Al anticipated, independent verbal transformations were heard by each of three people in the lab.

Dick