
RE: st: time efficient way to choose variables

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	RE: st: time efficient way to choose variables
Date	Wed, 4 Feb 2009 18:12:45 -0000

I'd add another reference. I am currently looking at a more recent book
by Izenman. All the details are in 

<http://www.springer.com/statistics/statistical+theory+and+methods/book/978-0-387-78188-4>

The same website promises a second edition of Hastie et al. for next
month! 

I think Jay is right. There is not much by way of implementation of
these methods in Stata.  

Nick 
[email protected] 

jverkuilen

As others have noted, this is a variant of the long-discredited stepwise
regression. 

There are better automatic variable selection procedures developed by
the machine learning people that go under colorful names like bagging
and boosting. These all use some kind of cross-validation or
bootstrapping to protect against capitalization on chance, to which
older stepwise procedures are very susceptible. I don't think they are
implemented in Stata, but maybe someone has done so. See, e.g., T Hastie,
R Tibshirani, J Friedman. 2001. The elements of statistical learning.
Springer. 
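
To illustrate the cross-validation idea mentioned above, here is a
minimal sketch in Python rather than Stata. It is not taken from the
post or from Hastie et al. The data are simulated, and the names
fit_simple_ols and cv_mse are hypothetical helpers written only for
this example. Each candidate predictor is scored by its out-of-fold
prediction error, so a variable that merely fits noise in the training
data gets no credit for it.

```python
import random

def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def cv_mse(xs, ys, k=5):
    """Mean squared out-of-fold prediction error from k-fold cross-validation."""
    n = len(xs)
    total = 0.0
    for fold in range(k):
        # Every k-th observation, starting at `fold`, is held out.
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        a, b = fit_simple_ols(train_x, train_y)
        total += sum((ys[i] - (a + b * xs[i])) ** 2 for i in test_idx)
    return total / n

# Simulated data (an assumption for illustration): y depends on `signal`
# only; `noise` is an irrelevant candidate predictor.
rng = random.Random(2009)
n = 100
signal = [rng.gauss(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 1) for _ in range(n)]
y = [2.0 * s + rng.gauss(0, 1) for s in signal]

scores = {"signal": cv_mse(signal, y), "noise": cv_mse(noise, y)}
best = min(scores, key=scores.get)  # candidate with the lowest held-out error
print(scores, best)
```

Bagging and boosting are more elaborate than this, but they rest on the
same principle of judging a model on data it was not fitted to.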

Model averaging is another approach. This pools predictions from models
using weights derived from goodness-of-fit measures, again protecting
against capitalization on chance by using bootstrapping of some sort.
See, e.g., KA Burnham and D Anderson. 2002. Model selection and
multimodel inference, 2nd Ed. Springer. 
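
As a rough sketch of the weighting step, one common choice is the
Akaike weights discussed by Burnham and Anderson, where each model's
weight is proportional to exp(-0.5 * (AIC_i - AIC_min)). The post does
not say which goodness-of-fit measure is meant, so AIC is an assumption
here. The AIC values and predictions below are made-up numbers for
illustration, and the function names are hypothetical.

```python
import math

def akaike_weights(aic_values):
    """Convert AIC values into weights that sum to 1 (lower AIC, higher weight)."""
    best = min(aic_values)
    relative = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative)
    return [r / total for r in relative]

def averaged_prediction(predictions, weights):
    """Pool one prediction per model into a single weighted prediction."""
    return sum(p * w for p, w in zip(predictions, weights))

# Hypothetical inputs: three candidate models, each with an AIC and a
# prediction for the same new observation.
aics = [100.0, 102.0, 110.0]
preds = [1.0, 2.0, 3.0]

weights = akaike_weights(aics)
pooled = averaged_prediction(preds, weights)
print(weights, pooled)
```

The pooled prediction always lies between the smallest and largest
individual predictions, and models with much worse AIC contribute very
little to it.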

