



Re: st: Controlling for sample selection in three stages including a count based model


From   federico.tedeschi@univr.it
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Controlling for sample selection in three stages including a count based model
Date   Wed, 27 Apr 2011 13:27:47 +0200

I can't answer your question about selection bias, because I don't understand what the source of the bias is, or where the comparison between two (or more) groups comes in.
I can tell you about my variable selection procedure, though I think many people here could suggest a better one. I use the command "boost", which fits a Multiple Additive Regression Tree model (see Schonlau, 2005, for the Stata implementation). I try different values for the number of nodes and the smoothing parameter, and keep the model with the best fit on the test set. I then look at the weight of each variable (the fraction of times it is used as the splitting variable for the nodes) and use these weights to drive my stepwise regression: each time a variable has to be inserted, I start with the one with the highest weight; each time a variable has to be dropped, I start testing from the one with the lowest weight. I don't know whether this is a good procedure, but I think the simple stepwise procedure is not very trustworthy...
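The ordering logic described above can be sketched roughly as follows. This is only an illustration in Python, not Stata code and not the "boost" plugin itself. The function name weight_guided_stepwise, the weights dictionary, and the score callback are all hypothetical. The sketch also assumes a single penalized fit criterion (higher is better) for both entry and removal, in place of separate significance tests, so that the loop is guaranteed to stop.

```python
from typing import Callable, Dict, FrozenSet, List

# Hypothetical fit criterion: takes a set of variable names, returns a
# number where higher is better (for example, a negated AIC).
Score = Callable[[FrozenSet[str]], float]


def weight_guided_stepwise(
    weights: Dict[str, float], score: Score, min_gain: float = 1e-9
) -> List[str]:
    """Stepwise selection whose trial order follows the boosting weights."""
    by_weight_desc = sorted(weights, key=weights.get, reverse=True)
    selected: List[str] = []
    current = score(frozenset())
    changed = True
    while changed:
        changed = False
        # Entry step: try candidates from the highest weight down and
        # add the first one that improves the criterion.
        for var in by_weight_desc:
            if var in selected:
                continue
            trial = score(frozenset(selected + [var]))
            if trial - current > min_gain:
                selected.append(var)
                current = trial
                changed = True
                break
        # Removal step: test from the lowest weight up and drop the
        # first variable whose removal improves the criterion.
        for var in sorted(selected, key=weights.get):
            trial = score(frozenset(v for v in selected if v != var))
            if trial - current > min_gain:
                selected.remove(var)
                current = trial
                changed = True
                break
    return selected


# Toy illustration with made-up weights and a made-up criterion:
# x1 and x3 carry signal, and every included variable costs 0.5.
weights = {"x1": 0.50, "x2": 0.30, "x3": 0.15, "x4": 0.05}
signal = {"x1": 3.0, "x3": 1.0}


def toy_score(variables: FrozenSet[str]) -> float:
    return sum(signal.get(v, 0.0) for v in variables) - 0.5 * len(variables)


selected = weight_guided_stepwise(weights, toy_score)
print(selected)  # ['x1', 'x3']
```

In practice the weights would come from the fitted boosted model and the score from refitting the regression on each candidate set. Because every accepted change must strictly improve the same criterion, the procedure cannot cycle.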
Federico

Schonlau, M., 2005, Boosted regression (boosting): An introductory tutorial and a Stata plugin, The Stata Journal 5, Number 3, pp. 330–354.
 


