The Stata listserver

re: st: Scheffe, Bonferroni and Sidak tests

From   David Airey
Subject   re: st: Scheffe, Bonferroni and Sidak tests
Date   Mon, 29 Dec 2003 10:32:20 -0600

In an ANOVA context, I like the Chapter 5 discussions in the text by Maxwell and Delaney, "Designing experiments and analyzing data: A model comparison perspective", 2004.

On this list, Roger Newson has provided advice and software to deal with multiple comparisons adjustments (findit smileplot). His personal site also has good stuff.

Different (posthoc) multiple comparisons procedures are used depending on what you are comparing. All pairs? Finding the best treatment? Comparing against a control group? You can achieve more power using the right test in the right context.
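As a rough illustration of why the choice of comparison family matters, the sketch below counts the comparisons in two of the families mentioned above and shows the per-comparison significance level that a plain Bonferroni split would leave for each. The function names `family_size` and `bonferroni_alpha` are hypothetical, and Bonferroni is used only because it is simple to compute by hand; dedicated procedures such as Dunnett's or Tukey's are generally less conservative than this.

```python
def family_size(k, design):
    """Number of comparisons in the family for k groups (hypothetical helper)."""
    if design == "all_pairs":
        # every group against every other group
        return k * (k - 1) // 2
    if design == "vs_control":
        # each treatment group against a single control group
        return k - 1
    raise ValueError("unknown design: " + design)


def bonferroni_alpha(alpha, m):
    """Per-comparison level when a familywise alpha is split over m tests."""
    return alpha / m


k = 5  # assumed number of groups, for illustration only
for design in ("all_pairs", "vs_control"):
    m = family_size(k, design)
    print(design, m, round(bonferroni_alpha(0.05, m), 4))
```

With five groups, the all-pairs family has 10 comparisons and the comparison-with-control family has only 4, so the smaller family leaves a less stringent threshold for each individual test.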

ANOVA in Stata could be improved to at least provide Hsu's, Dunnett's, Tukey's, Scheffe's, and Bonferroni's methods for multiple comparisons following any ANOVA. No doubt these procedures can be performed with a little programming from the bits and pieces left in memory following ANOVA in Stata, but they currently are not easily available to the non-programming user.

In psychology, genetics, and neuroscience, multiple test adjustments are taken seriously. I'm surprised to hear you say this, coming from a sociology background.


The -oneway- program gives you the option of doing Scheffe, Bonferroni and Sidak tests. Some other commands offer similar options, e.g. -test-. Is there any consensus as to which is best? Are there any situations in which one is clearly preferable to another? My experience has been that results tend to be similar, and the examples from the Stata reference guide also show little difference.

Also, as a practical matter, how often do these tests get done in practice? I see them rarely in the stuff I read and no reviewer has ever asked me to provide this info. I understand the rationale for these tests, but if you take that rationale to its logical conclusion it seems like every coefficient in every model should have its p-value adjusted to reflect the fact that multiple tests are going on. The tests also seem like they can be excessively conservative -- all differences between pairs of groups could be significant at the .05 level, but after making one of these adjustments none of the differences could be significant.
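The two observations above (that Bonferroni and Sidak results tend to be similar, and that pairwise differences significant at .05 can all lose significance after adjustment) can be checked with a small calculation. This is a minimal sketch using the textbook forms of the two adjustments, min(1, m*p) for Bonferroni and 1 - (1 - p)^m for Sidak; it is not taken from Stata's source code, and the number of groups and the raw p-value are made-up illustrative values. Scheffe's method is omitted because it is based on the F distribution rather than a simple transformation of the p-value.

```python
def bonferroni(p, m):
    """Bonferroni-adjusted p-value for one of m comparisons."""
    return min(1.0, m * p)


def sidak(p, m):
    """Sidak-adjusted p-value for one of m comparisons."""
    return 1.0 - (1.0 - p) ** m


k = 4                  # assumed number of groups (illustrative)
m = k * (k - 1) // 2   # all pairwise comparisons: 6
raw_p = 0.04           # assumed unadjusted p-value for every pair (illustrative)

print("comparisons:", m)
print("Bonferroni: %.4f" % bonferroni(raw_p, m))
print("Sidak:      %.4f" % sidak(raw_p, m))
```

Under these assumptions every pair is significant at the .05 level before adjustment, yet the adjusted values are roughly 0.24 (Bonferroni) and 0.22 (Sidak), so none remains significant. The Sidak value is never larger than the Bonferroni value, and the gap between them is small when the raw p-values are small.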