intrarater reliability - anchoring stimuli (Christel de Bruijn)

Subject: intrarater reliability - anchoring stimuli
From:    Christel de Bruijn  <christel(at)LARYNX.SHEF.AC.UK>
Date:    Tue, 1 Apr 2003 21:20:07 +0100

Dear list,

First of all, apologies for the long message, but I would really appreciate any comments or thoughts on my two questions below.

I was wondering if anybody could give me some advice on two problems: the first relates to the amount of data duplication needed to achieve a good estimate of intrarater reliability in a perceptual experiment, and the second relates to the number of training stimuli needed in the anchoring/training phase of the experiment.

I am about to start some tests on the perception of voice quality. A panel of 6 expert listeners (i.e. voice therapists) will be asked to rate the voice quality of a number of speech fragments on 12 - 15 perceptual parameters (e.g. roughness, breathiness etc.). The parameters are rated on a 5-point equal-appearing interval scale. The speech fragments consist of 156 sustained vowels (divided into 3 groups of different vowels), 52 fragments of conversational speech and 52 fragments of the Rainbow passage. The 3 different types of speech fragments will be presented in separate listening sessions.

In order to calculate intrarater reliability (i.e. the self-consistency of the listener), I need to duplicate some of the stimuli. The best way to do this is to duplicate all the stimuli. However, given the large amount of speech material in the tests, this would be very impractical. (The listeners will not be prepared to sit through 12 hours or so of testing.) The literature provides little guidance as to the minimum number of stimuli that should be duplicated in order to achieve an accurate reliability coefficient. Some studies report a duplication of 10% or less, some 30%, a few 50%, and the very odd study duplicates 100%, but no justification is ever given for the chosen percentage. (I must admit I haven't decided yet on which statistic to use for the reliability, but Pearson's r and intraclass correlation coefficients seem to be widely used.)

Therefore, my first question is: given the large amount of speech material, what is the minimum amount of data that should be duplicated? (A complicating factor is the use of conversation fragments. Listeners will probably be aware of the duplication, if only because of the conversation content, and may remember their scores for that particular fragment.)

--------------------------------------

The second question is related to the anchoring phase of the experiment. It is common practice to provide listeners with anchoring stimuli before the actual listening test. Usually the listeners are provided with explicit anchors, i.e. the speech fragment is presented together with the perceptual rating for a particular parameter. In my experiment, however, I have decided against the use of explicit anchors in order to avoid introducing a bias. (This is because the perceptual labels will become the baseline for acoustic correlates.) Instead, listeners will be presented with a random selection of stimuli (which should include all values of the scale, including the extremes), and are supposed to create their own anchors on the basis of these stimuli.

Again, very little information is available on how people decide on the number of anchoring samples. It's actually more of a stats problem. My question is: if I have a set of 156 vowels and each vowel is rated on 12 parameters on a scale from 0 - 4, how many vowels should be in my training set so that I can say with 95% probability that the begin and end values of the scale (i.e. 0 and 4) for each parameter are included in that set?

Apologies once again for the lengthy e-mail, and sincere thanks to those who took the trouble to read until the end. Any thoughts and comments will be very gratefully received!

Christel de Bruijn

Christel de Bruijn - PhD student
University of Sheffield
Department of Human Communication Sciences
31 Claremont Crescent
Sheffield S10 2TA
United Kingdom
phone: (+44) (0)114 22 22410
fax: (+44) (0)114 27 30547
e-mail: christel(at)
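
A note on the second question: for a single parameter, it can be framed as sampling without replacement, which gives a hypergeometric-style calculation. The sketch below is only an illustration under stated assumptions, not a definitive answer. It assumes the training set is a simple random sample from the 156 vowels, and that one can guess how many vowels would receive a 0 and how many a 4 on that parameter. Those two counts (`n_low` and `n_high`) are hypothetical inputs that the message does not give, as are the function names.

```python
from math import comb


def prob_both_extremes(n_total, n_low, n_high, n_sample):
    """Probability that a random sample drawn without replacement contains
    at least one 'low' item (rating 0) and at least one 'high' item (rating 4).

    n_low and n_high are assumed counts in the full stimulus set; they are
    not known in advance and must be guessed or estimated.
    """
    if n_low + n_high > n_total or not 0 <= n_sample <= n_total:
        raise ValueError("inconsistent counts")
    total = comb(n_total, n_sample)
    # Inclusion-exclusion over the two "missed an extreme" events.
    miss_low = comb(n_total - n_low, n_sample) / total
    miss_high = comb(n_total - n_high, n_sample) / total
    miss_both = comb(n_total - n_low - n_high, n_sample) / total
    return 1.0 - miss_low - miss_high + miss_both


def smallest_training_set(n_total, n_low, n_high, target=0.95):
    """Smallest sample size whose coverage probability reaches the target."""
    for n_sample in range(1, n_total + 1):
        if prob_both_extremes(n_total, n_low, n_high, n_sample) >= target:
            return n_sample
    return None


if __name__ == "__main__":
    # Hypothetical scenarios: how many of the 156 vowels sit at each extreme.
    for assumed_count in (5, 10, 20):
        n = smallest_training_set(156, assumed_count, assumed_count)
        print(f"assumed {assumed_count} vowels at each extreme -> n = {n}")
```

The required size depends heavily on how rare the extreme ratings are, so the assumed counts matter more than the formula. To cover all 12 parameters jointly, one conservative option is to ask each parameter to meet a stricter target such as 1 - 0.05/12 (a Bonferroni-style bound) and take the largest resulting size.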

This message came from the mail archive
maintained by:
DAn Ellis
Electrical Engineering Dept., Columbia University