RE: st: restricting margins to significant variables only

From	Richard Williams <[email protected]>
To	[email protected], [email protected]
Subject	RE: st: restricting margins to significant variables only
Date	Fri, 18 Mar 2011 15:32:15 -0500

At 12:21 PM 3/18/2011, Maarten Buis wrote:

--- On Fri, 18/3/11, Richard Williams wrote:
> I agree, and one of the things that has always troubled me
> is the view that diagnostic tests (and resulting model
> modifications) are good while stepwise regression is bad.

Doing model diagnostics and testing hypotheses require
different logics. Model diagnostics is all about a trade-off
between making the model simple enough that we understand the
results and complicated enough that it is close enough to
reality. Statistical testing is all about the trade-off between
the probability of rejecting a hypothesis when we should not and
the probability of failing to reject it when we should. So, by
performing a test at the model diagnostic stage one is applying
an inappropriate logic for that decision.

Suppose, however, that based on diagnostic tests/visual inspections we decide to add or transform variables in the model, e.g. we add X^2, use ln(X) instead of X, or include an interaction term for gender*income. These added/transformed variables have a pretty good chance of being statistically significant, even though we may just be capitalizing on chance features of the data. That doesn't mean we shouldn't do diagnostic tests/visual inspections, but we should realize that the P values for the final model may be deceptively good and the results make us look more like "geniuses" than we really are. Ergo, I agree with Nick when he says "those who want to select their models in the light of the data and then write them up keeping exactly the same view of P-values as if the model published is exactly the model first thought of are playing a rather strange game." (Even though I may have played that game myself!)
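The "capitalizing on chance" point can be illustrated with a small simulation. What follows is only a rough sketch under stated assumptions, not anything from the thread: the outcome y is pure noise, the candidate terms (X, X^2, ln(X), and an X*group interaction) are each tried in a simple one-predictor regression rather than added to a fuller model, and the sample size, seed, and function names are all made up for illustration. It compares how often a single term chosen in advance looks significant with how often the best-looking term does.

```python
import math
import random

# Approximate two-sided 5% critical value of t with 98 df (n = 100).
T_CRIT = 1.984


def t_stat(x, y):
    """t statistic for the slope in a simple regression of y on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt((n - 2) / (1 - r * r))


def simulate(reps=2000, n=100, seed=12345):
    """Return (rate for a pre-specified term, rate for the best of four terms)."""
    rng = random.Random(seed)
    hits_fixed = 0
    hits_selected = 0
    for _ in range(reps):
        x = [rng.uniform(1, 10) for _ in range(n)]
        g = [rng.randint(0, 1) for _ in range(n)]  # hypothetical 0/1 group
        y = [rng.gauss(0, 1) for _ in range(n)]    # pure noise, unrelated to x or g
        candidates = [
            x,                              # X as first thought of
            [a * a for a in x],             # X^2
            [math.log(a) for a in x],       # ln(X)
            [a * b for a, b in zip(x, g)],  # X * group interaction
        ]
        t_values = [abs(t_stat(c, y)) for c in candidates]
        if t_values[0] > T_CRIT:      # the term chosen before seeing the data
            hits_fixed += 1
        if max(t_values) > T_CRIT:    # the term chosen because it looked best
            hits_selected += 1
    return hits_fixed / reps, hits_selected / reps


if __name__ == "__main__":
    fixed_rate, selected_rate = simulate()
    print("pre-specified term significant:", fixed_rate)
    print("best-looking term significant: ", selected_rate)
```

The first proportion should sit near the nominal 5%, while the second cannot be smaller and will typically be noticeably larger, even though no term has any real relationship with y.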

If you are going to use stepwise selection, it is sometimes suggested that you should use more stringent P values or verify the results on a 2nd data set. The same advice might be good when diagnostic tests/visual inspections have significantly influenced the form of the final model.
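The "verify on a 2nd data set" advice can be sketched the same way. Again this is only an illustrative simulation under assumptions of my own choosing (pure-noise outcome, four hypothetical candidate terms each tried in a one-predictor regression, a 100/100 split, invented function names): the best-looking term is picked on the first half of the data and then re-tested on the second half, which played no part in the selection.

```python
import math
import random

# Approximate two-sided 5% critical value of t with 98 df (n = 100 per half).
T_CRIT = 1.984


def t_stat(x, y):
    """t statistic for the slope in a simple regression of y on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt((n - 2) / (1 - r * r))


def build_candidates(x, g):
    """Hypothetical candidate terms: X, X^2, ln(X), X * group."""
    return [
        x,
        [a * a for a in x],
        [math.log(a) for a in x],
        [a * b for a, b in zip(x, g)],
    ]


def simulate_holdout(reps=2000, n=100, seed=2011):
    """Return (in-sample rate, hold-out rate) for the selected term."""
    rng = random.Random(seed)
    hits_in_sample = 0
    hits_holdout = 0
    for _ in range(reps):
        x = [rng.uniform(1, 10) for _ in range(2 * n)]
        g = [rng.randint(0, 1) for _ in range(2 * n)]
        y = [rng.gauss(0, 1) for _ in range(2 * n)]  # pure noise

        # Select the most "significant" term using the first half only.
        train = build_candidates(x[:n], g[:n])
        t_train = [abs(t_stat(c, y[:n])) for c in train]
        best = t_train.index(max(t_train))
        if t_train[best] > T_CRIT:
            hits_in_sample += 1

        # Re-test that same term on the second half, untouched by selection.
        test = build_candidates(x[n:], g[n:])
        if abs(t_stat(test[best], y[n:])) > T_CRIT:
            hits_holdout += 1
    return hits_in_sample / reps, hits_holdout / reps


if __name__ == "__main__":
    in_sample_rate, holdout_rate = simulate_holdout()
    print("selected term significant in selection sample:", in_sample_rate)
    print("selected term significant in 2nd data set:    ", holdout_rate)
```

Because the second half is independent of the selection step, the hold-out rejection rate should fall back to roughly the nominal 5%, whereas the in-sample rate for the selected term stays inflated.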



-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  [email protected]
WWW:    http://www.nd.edu/~rwilliam
