



RE: st: restricting margins to significant variables only


From   Richard Williams <[email protected]>
To   [email protected], [email protected]
Subject   RE: st: restricting margins to significant variables only
Date   Fri, 18 Mar 2011 15:32:15 -0500

At 12:21 PM 3/18/2011, Maarten Buis wrote:
--- On Fri, 18/3/11, Richard Williams wrote:
> I agree, and one of the things that has always troubled me
> is the view that diagnostic tests (and resulting model
> modifications) are good while stepwise regression is bad.

Doing model diagnostics and testing hypotheses require
different logics. Model diagnostics are all about a trade-off
between making the model simple enough that we understand the
results and complicated enough that it is close enough to
reality. Statistical tests are all about the trade-off between
the probability of rejecting a hypothesis when we should not
and the probability of failing to reject it when we should.
So, by performing a test at the model diagnostic stage one is
applying an inappropriate logic for that decision.

Suppose, however, that based on diagnostic tests/visual inspections we decide to add or transform variables in the model, e.g. we add X^2, use ln(X) instead of X, or include an interaction term for gender*income. These added/transformed variables have a pretty good chance of being statistically significant, even though we may just be capitalizing on chance features of the data. That doesn't mean we shouldn't do diagnostic tests/visual inspections, but we should realize that the P values for the final model may be deceptively good and the results make us look more like "geniuses" than we really are. Ergo, I agree with Nick when he says "those who want to select their models in the light of the data and then write them up keeping exactly the same view of P-values as if the model published is exactly the model first thought of are playing a rather strange game." (Even though I may have played that game myself!)
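The point about capitalizing on chance can be illustrated with a small simulation. What follows is only a rough sketch under assumptions of my own, not anything from the thread: y is pure noise, the candidate terms (x, centered x^2, ln(x), and a gender*x interaction) are each tried in a separate simple regression rather than in one joint model, and 1.972 is used as an approximate two-sided 5% critical value for t with 198 degrees of freedom. The names t_stat and simulate are made up for the example.

```python
import math
import random


def t_stat(x, y):
    """t statistic for the slope in a simple regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt((n - 2) / (1 - r * r))


def simulate(n_sims=1000, n=200, crit=1.972, seed=12345):
    """Return (rate for a prespecified term, rate for the best-looking term).

    y is generated independently of every predictor, so any
    "significant" result is a false positive.
    """
    rng = random.Random(seed)
    single_hits = 0
    selected_hits = 0
    for _ in range(n_sims):
        x = [rng.uniform(1.0, 10.0) for _ in range(n)]
        g = [rng.choice((0, 1)) for _ in range(n)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mx = sum(x) / n
        candidates = {
            "x": x,
            "x^2": [(a - mx) ** 2 for a in x],
            "ln(x)": [math.log(a) for a in x],
            "g*x": [gi * a for gi, a in zip(g, x)],
        }
        t_values = {name: abs(t_stat(v, y)) for name, v in candidates.items()}
        # Model chosen in advance: only x is ever tested.
        if t_values["x"] > crit:
            single_hits += 1
        # Model chosen after looking: report whichever term looks best.
        if max(t_values.values()) > crit:
            selected_hits += 1
    return single_hits / n_sims, selected_hits / n_sims


if __name__ == "__main__":
    single, selected = simulate()
    print("prespecified term significant:", single)
    print("best-looking term significant:", selected)
```

The first rate should sit near the nominal 5%, while the second can never be lower and will generally be higher, since the best of several looks at the same noise is the one that gets reported.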

If you are going to use stepwise selection, it is sometimes suggested that you should use a more stringent P-value threshold or verify the results on a second data set. The same advice might be good when diagnostic tests/visual inspections have significantly influenced the form of the final model.
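The idea of verifying on a second data set can be sketched the same way. This is again a toy example under my own assumptions, not a recipe from the thread: y is pure noise, the data are split into an exploration half and a confirmation half, the best-looking of four candidate terms is picked on the first half, and only that term is tested on the second half. The function names are hypothetical, and 1.972 is an approximate two-sided 5% critical value for t with 198 degrees of freedom.

```python
import math
import random


def t_stat(x, y):
    """t statistic for the slope in a simple regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt((n - 2) / (1 - r * r))


def candidate_terms(x, g):
    """Candidate predictors built from x and a 0/1 indicator g."""
    mx = sum(x) / len(x)
    return {
        "x": x,
        "x^2": [(a - mx) ** 2 for a in x],
        "ln(x)": [math.log(a) for a in x],
        "g*x": [gi * a for gi, a in zip(g, x)],
    }


def split_sample_rates(n_sims=1000, n=400, crit=1.972, seed=2011):
    """Return (false-positive rate on the exploration half,
    false-positive rate when the chosen term is retested on fresh data)."""
    rng = random.Random(seed)
    half = n // 2
    naive_hits = 0
    holdout_hits = 0
    for _ in range(n_sims):
        x = [rng.uniform(1.0, 10.0) for _ in range(n)]
        g = [rng.choice((0, 1)) for _ in range(n)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]  # unrelated to x and g
        # Exploration half: pick the term with the largest |t|.
        terms_a = candidate_terms(x[:half], g[:half])
        t_a = {name: abs(t_stat(v, y[:half])) for name, v in terms_a.items()}
        best = max(t_a, key=t_a.get)
        if t_a[best] > crit:
            naive_hits += 1
        # Confirmation half: test only the chosen term on data not used to pick it.
        terms_b = candidate_terms(x[half:], g[half:])
        if abs(t_stat(terms_b[best], y[half:])) > crit:
            holdout_hits += 1
    return naive_hits / n_sims, holdout_hits / n_sims


if __name__ == "__main__":
    naive, holdout = split_sample_rates()
    print("selected term significant on the data used to select it:", naive)
    print("selected term significant on the second data set:", holdout)
```

Because the confirmation half played no part in choosing the term, its rejection rate should fall back to roughly the nominal level, at the cost of using only half the data for the final test.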


-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  [email protected]
WWW:    http://www.nd.edu/~rwilliam
