st: Should we drop not-significant terms from models?

From   Joseph Coveney <[email protected]>
To   Statalist <[email protected]>
Subject   st: Should we drop not-significant terms from models?
Date   Wed, 18 Feb 2004 16:34:56 +0900

In the closing post to the thread "st: Comparing change in rates -
frustrating problem: questionable results", Ricardo Ovaldia asked:

>One last thing, if the interaction term is not
>significant, does it still need to be included in the

Does anyone on the list have a reference to cite that provides guidance on
this matter?  My understanding is that there is disagreement among the

It might be helpful to distinguish circumstances (for example,
model-building and hypothesis-testing) in which the question could arise.
There could legitimately be different rules for each.

On one hand, in a model-building exercise, terms are deleted in the interest
of parsimony and generalizability of the final statistical model of the data
or phenomenon.  In this circumstance, terms could be deleted from the model
in accordance with a statistical criterion--for example, a p-value
threshold--or a set of statistical and nonstatistical criteria.

On the other hand, in a hypothesis-testing setting, the statistical model is
constructed on the basis of content prior to having the data in hand.  To
delete a term here would be to change the nature of the hypothesis tested by the

In practice, the difference behind this simple-minded distinction might not
be clear-cut.  I have seen statisticians eschew dropping nonsignificant
terms from multiple regression models, invoking arguments akin to those
against stepwise regression, but then advocate the use of sequential (SAS
Type I/II) sums of squares in factorial ANOVA with interaction.
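
To make the point about sequential sums of squares concrete, here is a
minimal sketch in plain Python (standard library only).  The data are made
up, and the function names (solve, rss, sequential_ss) are hypothetical
helpers written for this illustration, not part of Stata or SAS.  Each
term's sequential sum of squares is the drop in residual sum of squares
when that term is added to the terms already in the model, so in an
unbalanced design the value for a main effect depends on the order of
entry.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def rss(columns, y):
    """Residual sum of squares from least squares of y on the given columns."""
    k = len(columns)
    xtx = [[sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(u * v for u, v in zip(columns[i], y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[j] * columns[j][i] for j in range(k)) for i in range(len(y))]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def sequential_ss(terms, y):
    """Sequential sums of squares for an ordered list of (name, column) terms."""
    columns = [[1.0] * len(y)]          # start from the intercept-only model
    previous = rss(columns, y)
    result = {}
    for name, column in terms:
        columns.append(column)
        current = rss(columns, y)
        result[name] = previous - current   # reduction in RSS due to this term
        previous = current
    return result


# Made-up unbalanced 2x2 layout: cell sizes are 2, 1, 1 and 3.
a = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
b = [0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0]
ab = [ai * bi for ai, bi in zip(a, b)]
y = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0]

a_first = sequential_ss([("A", a), ("B", b), ("A#B", ab)], y)
b_first = sequential_ss([("B", b), ("A", a), ("A#B", ab)], y)

for name in ("A", "B", "A#B"):
    print(name, round(a_first[name], 4), round(b_first[name], 4))
```

In both orderings the interaction is entered last, so its sum of squares is
the same; only the main-effect values move.  Deciding whether to keep or
drop a term on the basis of such a table therefore already involves a
choice about which other terms it is adjusted for.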

Joseph Coveney
