Re: within subject comparisons


Firstly, thank you to all respondents for your comments on within-subject
comparisons. There have certainly been a wide range of opinions, and
that's not even including the statisticians I have talked to!

Wherever you happen to stand on it, it seems clear that if you are doing
within-subject comparisons, the independence assumption is maintained
provided you accept that the nature of the inference is different (as
Pallier put it). Can we assume that we are sampling *randomly* (and thus
independently) from a particular subject's population of responses?
Michael Kubovy believes you can, provided you implement certain
counterbalancing techniques to prevent autocorrelations. Can we ever
assume absolute independence between the responses of a single subject,
even if stringent counterbalancing is employed? I doubt it, but maybe we
don't need it to be absolute (although some statistical purists will
argue otherwise).

Al Bregman has questioned the usefulness of such comparisons because the
subject's own population of responses is the only justifiable
statistical generalization. Al points out that if you provide the reader
with error bars and descriptive statistics then common sense should do
the rest. I think this may be a reasonable approach, but I think it could
only be reasonable if the estimates of the variance that are used to
generate standard error are accurate (which requires independence). If
the data are highly autocorrelated then the variance will be
underestimated, the error bars will be misleading, and any conclusions
that the reader might draw from them based on common sense may be
mistaken.
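
The effect of autocorrelation on the standard error can be illustrated
with a small simulation. This is only a sketch under stated assumptions:
the responses are modelled as a first-order autoregressive (AR(1))
series, and the names ar1_series and naive_vs_actual_se, the coefficient
phi = 0.8 and the series length of 50 are all hypothetical choices, not
anything taken from the discussion. It compares the usual standard error
(SD divided by the square root of n, which assumes independence) with
the spread of the mean actually observed over many repeated series.

```python
import random
import statistics


def ar1_series(n, phi, rng):
    """Return n values of a zero-mean AR(1) series: x[t] = phi * x[t-1] + noise."""
    x = rng.gauss(0.0, 1.0)  # arbitrary starting value (assumption)
    out = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def naive_vs_actual_se(n=50, phi=0.8, runs=2000, seed=1):
    """Compare the independence-based SE with the observed SD of the mean."""
    rng = random.Random(seed)
    means, naive_ses = [], []
    for _ in range(runs):
        xs = ar1_series(n, phi, rng)
        means.append(statistics.fmean(xs))
        # Usual formula, valid only if the n values are independent.
        naive_ses.append(statistics.stdev(xs) / n ** 0.5)
    # Average naive SE, and how much the mean really varies across runs.
    return statistics.fmean(naive_ses), statistics.stdev(means)


naive, actual = naive_vs_actual_se()
print(f"naive SE: {naive:.3f}   observed SD of the mean: {actual:.3f}")
```

With positive autocorrelation the naive figure comes out well below the
observed spread of the mean; with phi = 0 the two should roughly agree.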

A good example of this can be found in adaptive staircase techniques.
A threshold point is often taken as the mean of a prescribed number of
reversals. The variance among these reversals is not very useful as an estimate
of the standard error of the threshold because of the high
autocorrelation between the stimulus level at each turnaround.
Typically, a standard error computed from the variance within a
staircase will underestimate the variability of the threshold across
staircases. If, for some reason, we were unaware of this correlation and
reported the standard error of the threshold based on this variance, we
would be overstating the precision of the estimate, and the reader,
equipped with only the descriptive statistics, would be misled.
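
The staircase case can be sketched in the same way. Everything specific
here is an assumption made for illustration: a simple 1-up/1-down rule,
a logistic psychometric function with a shallow slope relative to the
step size, four discarded reversals followed by eight averaged ones, and
the hypothetical names run_staircase and compare. Other rules, step
sizes or slopes may give a different picture, so this is not a general
result.

```python
import math
import random
import statistics


def run_staircase(rng, threshold=20.0, spread=4.0, start=24.0, step=1.0,
                  n_discard=4, n_keep=8):
    """Simulate one 1-up/1-down staircase and return the kept reversal levels."""
    level = start
    last_direction = 0  # -1 = down, +1 = up, 0 = no trial run yet
    reversals = []
    while len(reversals) < n_discard + n_keep:
        # Hypothetical logistic psychometric function: P(correct) at this level.
        p_correct = 1.0 / (1.0 + math.exp(-(level - threshold) / spread))
        # Correct response -> make the task harder (down); incorrect -> easier (up).
        direction = -1 if rng.random() < p_correct else 1
        if last_direction != 0 and direction != last_direction:
            reversals.append(level)  # level at which the track turned around
        level += direction * step
        last_direction = direction
    return reversals[n_discard:]


def compare(n_staircases=2000, seed=7):
    """Return (mean within-staircase SE, SD of thresholds across staircases)."""
    rng = random.Random(seed)
    thresholds, within_ses = [], []
    for _ in range(n_staircases):
        revs = run_staircase(rng)
        thresholds.append(statistics.fmean(revs))
        # SE computed as if the reversal levels were independent.
        within_ses.append(statistics.stdev(revs) / math.sqrt(len(revs)))
    return statistics.fmean(within_ses), statistics.stdev(thresholds)


within_se, across_sd = compare()
print(f"within-staircase SE: {within_se:.2f}   across-staircase SD: {across_sd:.2f}")
```

Under these assumptions the within-staircase standard error is smaller
than the spread of the threshold estimates across repeated staircases,
which is the quantity a reader would actually want to know.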

I think it is important to show statistical significance of
within-subject comparisons because it should force us to examine the
level of autocorrelation or other dependencies that may be lurking in
the data. Even if no violations are present, the p value and,
particularly, estimates of effect size and power are still useful in
telling us about the size of the effect, its replicability, and its
likelihood of being real.
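
One simple way to examine the level of autocorrelation in a single
subject's responses is to estimate the lag-1 autocorrelation of the
responses in the order they were collected. The sketch below is only
illustrative: the function names are hypothetical, and the "effective
sample size" adjustment n(1 - r)/(1 + r) is an approximation that
assumes the dependence behaves like a first-order autoregressive
process, which real data may not.

```python
def lag1_autocorrelation(xs):
    """Estimate the lag-1 autocorrelation of a sequence of responses."""
    n = len(xs)
    mean = sum(xs) / n
    denom = sum((x - mean) ** 2 for x in xs)
    num = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return num / denom


def effective_sample_size(xs):
    """Approximate number of independent observations, assuming AR(1) dependence."""
    r = lag1_autocorrelation(xs)
    return len(xs) * (1.0 - r) / (1.0 + r)


# A steadily drifting series is positively autocorrelated ...
drifting = [1, 2, 3, 4, 5, 6]
# ... while a strictly alternating one is negatively autocorrelated.
alternating = [1, -1, 1, -1, 1, -1]

print(lag1_autocorrelation(drifting))
print(lag1_autocorrelation(alternating))
print(effective_sample_size(drifting))
```

A value near zero is consistent with (though it does not prove)
independence; a clearly positive value suggests that the nominal n
overstates the amount of information in the data.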

Thanks again for your many contributions.

Chris Chambers
Department of Psychology
Monash University
Clayton, Victoria 3168

Tel. +61 3 9905 3978
Fax. +61 3 9905 3948

EMAIL: chris.chambers@sci.monash.edu.au