The Stata listserver

RE: st: newbie question: nonsig posthoc after sig anova

From   "Nick Cox" <>
To   <>
Subject   RE: st: newbie question: nonsig posthoc after sig anova
Date   Mon, 18 Aug 2003 19:02:42 +0100

Roger Newson wrote:

> An F-test comparing 5 groups is implicitly defining a 95% confidence
> region, in 4-dimensional "hyperspace", for 4 mean differences between
> the 4 non-reference groups and the reference group. Jim's "barely
> significant" P-value implies that this confidence region does not
> contain the vector of 4 zero mean differences, which would be the true
> population mean differences if all 5 groups had the same mean. However,
> the post-hoc tests seem to imply that the confidence region for the 4
> mean differences might contain zero values for any of the 4 mean
> differences individually. Therefore, although it seems that at least
> one difference is non-zero, Jim has insufficient data to incriminate
> one of these differences as being "the culprit".
>
> Jim doesn't say what the ANOVA is about. However, most statisticians
> nowadays, most of the time, prefer confidence intervals to P-values
> alone, because P-values only measure the compatibility of the data
> with zero population differences, and do not give a range of positive
> and/or negative and/or zero population differences with which the
> data ARE compatible. A good introduction to confidence intervals,
> commonly used in the medical sector, is Altman et al. (2000).
>
> Reference
>
> Douglas Altman, David Machin, Trevor Bryant, Martin Gardner.
> Statistics with Confidence. London: British Medical Journal Books;
> 2000. ISBN: 0727913751

I have a more general question arising obliquely
out of these issues.

The point of these multiple comparison procedures,
Bonferroni, Scheffe, Sidak, etc. (and sprinkle all
the accents required on those names) is, as I
understand it, to inject a strong note of
caution given the number of individual tests
you could carry out and the built-in tendency
that the more you carry out, the more are likely to attain
significance at some conventional level, and so forth.
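
To put rough numbers on that tendency, here is a minimal sketch in
Python (not Stata, and not taken from the thread). It assumes the m
tests are independent, which pairwise comparisons among 5 groups are
not, so the familywise figure is only indicative. The function names
are hypothetical, chosen for illustration.

```python
from math import comb


def familywise_error_rate(alpha, m):
    """Chance of at least one false positive in m tests at level alpha.

    Assumes the m tests are independent (an idealisation).
    """
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(alpha, m):
    """Per-test level under the Bonferroni correction."""
    return alpha / m


def sidak_threshold(alpha, m):
    """Per-test level under the Sidak correction (exact under independence)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


if __name__ == "__main__":
    groups = 5
    m = comb(groups, 2)  # number of pairwise comparisons among 5 groups
    alpha = 0.05
    print("pairwise comparisons:", m)
    print("uncorrected familywise rate:", round(familywise_error_rate(alpha, m), 4))
    print("Bonferroni per-test level:", bonferroni_threshold(alpha, m))
    print("Sidak per-test level:", round(sidak_threshold(alpha, m), 6))
```

The Sidak level comes out slightly less strict than the Bonferroni
level. Scheffe's method works differently (through the F distribution)
and is not covered by this sketch.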

What is the attitude to fishing _among_ multiple
comparison procedures, i.e. looking _among_ various
different post hoc results with the pitfall that
you're tempted to report the one closest to your
pre-conceived (ne)science?

Aren't you supposed to cleave to the one whose
inferential logic you find most compelling?

Is this a documented issue?

