Re: Usual settings for the transformed up-down procedure
Hi Massimo,
We use up to 16 reversals (always an even number). We halve the step size
(not 1/sqrt(2), but 1/2) after reversal two, and a second time after
reversal 4. This corresponds to a sort of simplified and truncated
Robbins-Monro (RM) procedure. In a true RM, the step size would be
something like 1/n, where n is the trial number. In computer simulations
(don't ask me the references), RM has been shown to be kind of optimal
for staircase step size rules. Somewhere it has also been shown that
1/(2^r) (where r is the number of downwards reversals) is nearly as good
as 1/n, so we prefer it because it is simpler. We stop reducing the step
size after r=2 (4th reversal = 2nd downward reversal) because in
computer simulations the psychometric function is assumed to be constant
over time, while in real life it will fluctuate.
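
A minimal sketch of the step-size rule described above, assuming the step is looked up from the number of reversals counted so far. The function name and parameters are hypothetical, not taken from any existing software:

```python
def step_size(initial_step, reversals_so_far, max_halvings=2):
    """Return the current step size for the simplified, truncated rule.

    The step is halved after reversal 2 and again after reversal 4,
    i.e. once per completed pair of reversals, and is then held
    constant (max_halvings=2 is the truncation point assumed here).
    """
    halvings = min(reversals_so_far // 2, max_halvings)
    return initial_step / (2 ** halvings)


if __name__ == "__main__":
    # Print the step size in use after 0..6 reversals, starting at 8 dB.
    for r in range(7):
        print(r, step_size(8.0, r))
```

With an initial step of 8 dB this gives 8 dB before reversal 2, 4 dB from reversal 2 on, and 2 dB from reversal 4 until the end of the run.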
I know of quite extensive comparisons of the different methods to
evaluate the means (one of the authors was Birger Kollmeier, it was in
the eighties in JASA, if I remember correctly). There were only slight
differences between those methods. My personal conclusion was that one should
average all levels (not only the reversal points), starting with the
level of the trial following reversal 4, and including the post-final
level which would have been presented next if the procedure had not
stopped after, say, 12 or 16 reversals.
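
The averaging rule above can be sketched as follows. This is only an illustration: the function name is hypothetical, and it assumes that `reversal_trials` holds the 0-based indices of the trials counted as reversals, which is one possible bookkeeping convention:

```python
def threshold_estimate(levels, reversal_trials, next_level, skip_reversals=4):
    """Average all presented levels after the first few reversals.

    levels          -- level presented on each trial, in order
    reversal_trials -- 0-based trial indices counted as reversals (assumed convention)
    next_level      -- the "post-final" level that would have been presented next
    skip_reversals  -- averaging starts with the trial following this reversal
    """
    start = reversal_trials[skip_reversals - 1] + 1
    # All levels after the chosen reversal, plus the post-final level.
    tail = list(levels[start:]) + [next_level]
    return sum(tail) / len(tail)


if __name__ == "__main__":
    levels = [10, 8, 6, 8, 7, 6, 7, 6]
    reversal_trials = [2, 3, 5, 6]
    print(threshold_estimate(levels, reversal_trials, next_level=5))
```

Unlike a mean over reversal points, this uses every level visited after the step size has stopped changing.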
Most important point :-) We don't use transformed up-down. We use
weighted up-down (Kaernbach, 1991). It corresponds to a one-up one-down
"transformed" rule, but with different step sizes. It is more efficient
than transformed up-down, and more flexible. You can have any target
performance you want: 75%, 76%, 77%... Transformed up-down is limited to
71% (two-step) versus 79% (three-step), with nothing in-between. Ideally
you would want 75%, which results in a very simple weighted up-down
procedure: -1 dB after a correct response, +3 dB after an incorrect
response.
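
To make the arithmetic behind these numbers explicit, here is a small sketch. It assumes the usual equilibrium argument: a weighted track settles where the expected movement is zero, i.e. p * step_after_correct = (1 - p) * step_after_incorrect, and an n-down 1-up track settles where p^n = 0.5. The function names are hypothetical:

```python
def step_after_incorrect(step_after_correct, target):
    """Upward step that makes a weighted up-down track converge on `target`.

    At equilibrium: target * step_after_correct == (1 - target) * upward step.
    """
    return step_after_correct * target / (1.0 - target)


def transformed_target(n_down):
    """Proportion correct targeted by an n-down 1-up rule (p**n == 0.5)."""
    return 0.5 ** (1.0 / n_down)


if __name__ == "__main__":
    print(step_after_incorrect(1.0, 0.75))   # 1 dB down pairs with 3 dB up
    print(round(transformed_target(2), 3))   # 2-down 1-up
    print(round(transformed_target(3), 3))   # 3-down 1-up
```

For a 75% target the ratio is 0.75 / 0.25 = 3, which is the -1 dB / +3 dB rule; the transformed rules give about 70.7% and 79.4%.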
And that's not the final move. Now we use a variant of weighted up-down,
"unforced weighted up-down" (Kaernbach, 2001), which is to my knowledge
the only adaptive psychophysical procedure to include "I don't know"
responses. We could show that the inclusion of this type of response
does not impair the validity, while improving the efficiency of the
procedure and (most important) the comfort of the participants (at least
for naive participants).
Best,
Chris
References:
Kaernbach, C. (1991).
Simple adaptive testing with the weighted up-down method,
Perception & Psychophysics 49, 227-229.
http://www.uni-kiel.de/psychologie/emotion/team/kaernbach/publications/1991a_kae_p&p.pdf
(There is a typo in Eq. 1 and the line that follows: Sup and Sdown got
interchanged.)
Kaernbach, C. (2001).
Adaptive threshold estimation with unforced-choice tasks,
Perception & Psychophysics 63, 1377-1388.
http://www.uni-kiel.de/psychologie/emotion/team/kaernbach/publications/2001a_kae_p&p.pdf
Dear list members (and transformed up-down users in particular),
we are extending our MLP software
(http://www.psy.unipd.it/~grassi/mlp.html) to the transformed up-down
adaptive procedure so the user of our software can decide to run a given
threshold-experiment with either the maximum likelihood procedure or the
transformed up-down (wow! what fun!).
We are now working on the software and facing the first problem: what
are the "traditional" settings for the transformed up-down?
In brief:
- how many "downs"?
- how many reversals before stopping the procedure?
- how many reversals with a small step-size and how many with a large one?
- how many step-sizes?
- what's the (is there any) "usual" ratio between small and large step-size?
- how to calculate the subject's threshold?
Here are the traditional rules, based on my own reading:
- Number of "downs". I found only studies using the 2(or 3)-down 1-up.
No study using the 4-down (or higher) 1-up rule. Furthermore, no study
using the 1-down, 2/3/4...-up.
- Reversals and step-size. So far I found that people tend to use two
step sizes only (e.g., factor 2 and factor 2^(1/2), i.e., sqrt(2)) and
to run 4 reversals with the large step size and 8 (or 12) reversals with
the small step size.
- For the threshold calculation I found that it is usually calculated on
the last 8 (or 12) reversal points of a given block. Moreover, I found
that people use the arithmetic mean, geometric mean or median interchangeably.
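
As an illustration of the threshold calculation in the last point, here is a minimal sketch using Python's standard `statistics` module. The function name and the `method` labels are hypothetical, and the geometric mean assumes strictly positive levels (so it does not suit levels expressed in dB that can be zero or negative):

```python
import statistics


def reversal_threshold(reversal_levels, n_last=8, method="arithmetic"):
    """Estimate the threshold from the last n_last reversal points."""
    tail = list(reversal_levels)[-n_last:]
    if method == "arithmetic":
        return statistics.mean(tail)
    if method == "geometric":
        return statistics.geometric_mean(tail)  # needs positive values
    if method == "median":
        return statistics.median(tail)
    raise ValueError("unknown method: " + method)


if __name__ == "__main__":
    # 4 reversals with the large step, then 8 with the small one.
    reversals = [40, 10, 20, 5, 10, 7, 10, 7, 10, 7, 10, 7]
    for m in ("arithmetic", "geometric", "median"):
        print(m, reversal_threshold(reversals, n_last=8, method=m))
```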
I'm particularly interested in exceptions to the "traditional rules" I
found: we want to include these exceptions in our software.
Thank you in advance for the help you can provide,
m