
Re: [AUDITORY] stats (mis)use in psychology and hearing science



Dear list,

> From:    Pierre Divenyi <pdivenyi@xxxxxxxxxxxxxxxxxx>
> Subject: Re: stats (mis)use in psychology and hearing science
> 
> That may be correct under certain circumstances but the real problem is
> ascertaining that ANOVA is, indeed, appropriate. And it is truly a "REAL"
> problem!
> 
> On 9/22/13 11:15 PM, "Kyle Nakamoto" <knakamoto@xxxxxxxxxx> wrote:
> 
> >Nonparametrics are not automatically better. If you use a nonparametric
> >statistic when an ANOVA is appropriate the chance of missing a real effect
> >increases (False Negative).


It is of course true that we should all pay attention to the assumptions of statistical tests. For example, one should not use tests that assume independent observations when analyzing data from a repeated-measures design. Another example, as I mentioned in a previous posting, is that repeated-measures ANOVAs are sensitive to departures from normality, and many additional problems arise when the design is unbalanced (i.e., unequal group sizes; see Keselman, Algina, & Kowalchuk, 2001).
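
To illustrate the first point, here is a minimal Monte-Carlo sketch in Python (numpy/scipy). The sample size, correlation, and effect size are arbitrary values I chose for illustration, not taken from any of the studies cited below. With positively correlated repeated measures, an independent-samples t-test that wrongly ignores the pairing loses considerable power compared to the paired test that respects the design:

import numpy as np
from scipy.stats import ttest_ind, ttest_rel

rng = np.random.default_rng(0)
n, rho, effect, nsim, alpha = 15, 0.7, 0.5, 5000, 0.05
cov = [[1.0, rho], [rho, 1.0]]  # within-subject correlation

sig_indep = sig_paired = 0
for _ in range(nsim):
    # n subjects, two conditions; true condition effect = 0.5 SD
    x = rng.multivariate_normal([0.0, effect], cov, size=n)
    sig_indep += ttest_ind(x[:, 0], x[:, 1]).pvalue < alpha   # ignores the pairing
    sig_paired += ttest_rel(x[:, 0], x[:, 1]).pvalue < alpha  # respects the design

print("Power, independent-samples t-test (wrong model):", sig_indep / nsim)
print("Power, paired t-test (correct model):           ", sig_paired / nsim)
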
However, it is a sort of "magical" thinking to believe that nonparametric methods are *free* of assumptions; this is of course not the case.

Just to give you two simple examples:

1) The nonparametric Friedman rank test, which can be used for analyzing data from a single-factor repeated-measures design, assumes equal variances and equal covariances of the measures. In real data sets this assumption is almost always violated, a situation known as a deviation from sphericity. That is why we (hopefully) use, for example, the Huynh-Feldt correction of the degrees of freedom when computing a repeated-measures ANOVA with the univariate approach. Simulation studies have shown that when the (co-)variances are heterogeneous, Friedman's test does *not* control the Type I error rate (i.e., the probability of producing a significant p-value (e.g., p < .05) when the population means are *identical*), especially when the distribution of the response measure is skewed (Harwell & Serlin, 1994; St. Laurent & Turk, 2013).
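
For readers who like to see this with their own eyes, here is a minimal Monte-Carlo sketch in Python (scipy). The covariance values are arbitrary choices of mine, not taken from the cited papers. It simulates skewed (lognormal) repeated measures whose population means are identical but whose variances and covariances are heterogeneous, and estimates the empirical Type I error rate of Friedman's test:

import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(1)
n, nsim, alpha = 20, 5000, 0.05

# Heterogeneous variances and covariances across the three conditions
cov = np.array([[1.0, 0.3, 0.1],
                [0.3, 2.0, 0.6],
                [0.1, 0.6, 4.0]])

# Rescale each condition so the population *means* are identical (H0 true)
scale = np.exp(np.diag(cov) / 2.0)

rejections = 0
for _ in range(nsim):
    z = rng.multivariate_normal(np.zeros(3), cov, size=n)
    x = np.exp(z) / scale  # skewed (lognormal) scores with equal means
    rejections += friedmanchisquare(x[:, 0], x[:, 1], x[:, 2]).pvalue < alpha

print("Empirical Type I error rate:", rejections / nsim, " nominal:", alpha)

Under such conditions, the empirical rate can deviate noticeably from the nominal 5% level, in line with the simulation results cited above.
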

2) The (nonparametric) Mann-Whitney U-test, which can be used to compare the central tendency (location) of two independent samples, does not control the Type I error rate in the case of variance heterogeneity (i.e., when the two groups have different variances), especially when the distribution of the response measure is asymmetric (skewed). In fact, the U-test was shown to be *more* sensitive to violations of these assumptions than the classical t-test under several conditions (Stonehouse & Forrester, 1998).
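
Again, a minimal sketch in Python; the sample sizes and scale parameters are illustrative assumptions on my part, not taken from Stonehouse & Forrester. Two skewed groups with identical population means but unequal variances and unequal n are analyzed with both the U-test and the classical t-test:

import numpy as np
from scipy.stats import mannwhitneyu, ttest_ind

rng = np.random.default_rng(2)
nsim, alpha = 5000, 0.05
n1, n2 = 10, 40            # unequal sample sizes
s1, s2 = 0.5, 1.5          # unequal (lognormal) scale parameters

rej_u = rej_t = 0
for _ in range(nsim):
    # Skewed samples, rescaled so both population means equal 1 (H0 true)
    g1 = rng.lognormal(0.0, s1, n1) / np.exp(s1**2 / 2)
    g2 = rng.lognormal(0.0, s2, n2) / np.exp(s2**2 / 2)
    rej_u += mannwhitneyu(g1, g2, alternative="two-sided").pvalue < alpha
    rej_t += ttest_ind(g1, g2, equal_var=True).pvalue < alpha  # classical t-test

print("U-test empirical Type I error:", rej_u / nsim)
print("t-test empirical Type I error:", rej_t / nsim)

Comparing the two empirical rates with the nominal level gives a feeling for how strongly each test is affected by the combined violation of normality and variance homogeneity.
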

Thus, it is not a given that "nonparametric" methods are more "robust" (or even "free of assumptions") than parametric methods like the GLM. Instead, if you have reason to believe that the assumptions of a parametric test are violated, it is a good idea to consult the (sometimes very extensive, sometimes very sparse) literature on simulation studies concerning the effects of such violations. Sometimes it turns out that using the supposedly "bad" good old t-test or another parametric method is in fact superior to applying a nonparametric procedure. But sometimes the answer will likely be neither "parametric is better" nor "nonparametric is better", but "it's complicated" (or: "we don't know yet")...

I would be interested to learn which data analysis problems you are facing in your research -- maybe we could use this list to identify the best solutions to these problems?

Best,

Daniel

Harwell, M. R., & Serlin, R. C. (1994). A Monte-Carlo study of the Friedman test and some competitors in the single factor, repeated-measures design with unequal covariances. Computational Statistics & Data Analysis, 17(1), 35-49.
Keselman, H. J., Algina, J., & Kowalchuk, R. K. (2001). The analysis of repeated measures designs: A review. British Journal of Mathematical and Statistical Psychology, 54, 1-20.
St. Laurent, R., & Turk, P. (2013). The effects of misconceptions on the properties of Friedman's test. Communications in Statistics - Simulation and Computation, 42(7), 1596-1615.
Stonehouse, J. M., & Forrester, G. J. (1998). Robustness of the t and U tests under combined assumption violations. Journal of Applied Statistics, 25(1), 63-74.




PD Dr. Daniel Oberfeld-Twistel
Johannes Gutenberg - Universitaet Mainz
Department of Psychology
Experimental Psychology
Wallstrasse 3
55122 Mainz
Germany

Phone ++49 (0) 6131 39 39274 
Fax   ++49 (0) 6131 39 39268
http://www.staff.uni-mainz.de/oberfeld/

http://www.facebook.com/daniel.oberfeldtwistel