Statalist The Stata Listserver

RE: st: stepwise (with multiple imputation)

From   "Feiveson, Alan H. (JSC-SK311)" <[email protected]>
To   <[email protected]>
Subject   RE: st: stepwise (with multiple imputation)
Date   Fri, 1 Sep 2006 11:58:05 -0500

Once you have the model, then you do the regression with only the "real"
data. If you come up with a different "best" model each time you do the
imputation, then probably any of these "best" ones will do. The
difference between them is noise.

Al F.

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of
[email protected]
Sent: Friday, September 01, 2006 10:15 AM
To: [email protected]
Subject: Re: st: stepwise (

Hi Richard, 

I specified 'Forward' selection in both SPSS and Stata, and my
understanding of 'Forward' is: 

You start with the empty model, i.e. with just the constant, then you add
each candidate variable in turn and examine the p values. You include in
the model the one with the lowest p value, provided it also meets the
entry criterion, say p<.15. Then you repeat, retaining that variable,
until the variable with the lowest p value is no longer less than .15. This
procedure I learnt from Hosmer and Lemeshow: Applied Logistic
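
The forward procedure described above can be sketched roughly as follows. This is only an illustration of the loop, not what SPSS or Stata run internally: `forward_select`, `p_enter` and `toy_p_value` are hypothetical names, and the p values in the toy table are made up so the example runs without fitting a real model.

```python
from typing import Callable, List, Sequence


def forward_select(
    candidates: Sequence[str],
    p_value: Callable[[List[str], str], float],
    p_enter: float = 0.15,
) -> List[str]:
    """Start from the constant-only model and keep adding the candidate
    with the lowest p value while that p value is below p_enter."""
    selected: List[str] = []
    remaining = list(candidates)
    while remaining:
        # p value of each remaining variable when added to the current model
        trial = {name: p_value(selected, name) for name in remaining}
        best = min(trial, key=trial.get)
        if trial[best] >= p_enter:
            break  # the lowest p value no longer meets the entry criterion
        selected.append(best)
        remaining.remove(best)
    return selected


def toy_p_value(selected: List[str], candidate: str) -> float:
    """Stand-in for a real model fit: looks up invented p values."""
    table = {
        ((), "x1"): 0.01, ((), "x2"): 0.40, ((), "x3"): 0.10,
        (("x1",), "x2"): 0.30, (("x1",), "x3"): 0.12,
        (("x1", "x3"), "x2"): 0.50,
    }
    return table[(tuple(selected), candidate)]


print(forward_select(["x1", "x2", "x3"], toy_p_value))
```

In real use the `p_value` callback would refit the model with the candidate added and return that variable's p value. Which observations that refit uses is exactly the question raised in this thread.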

I don't see why you can't work with all available data at each try. 
Arguably there is the downside that you're comparing models with
different numbers of observations. But it just bothers me that at the end
of the day I have a final model that doesn't have the same results as
when I simply enter the variables. Moreover, if there are lots of
variables, we may end up running the procedure on only half of the data,
which is a bit stupid I think. Alan, thanks for the suggestion of
multiple imputation, but that's not my concern at the moment, because I
won't be using it simply because it's too complicated. In any case, how
do you run stepwise regression on several different imputed datasets and
decide on one final one at the end? 


Richard Williams <[email protected]> Sent by:
[email protected]
01/09/2006 16:49
Please respond to
[email protected]

[email protected]

Re: st: stepwise

At 09:28 AM 9/1/2006, [email protected] wrote:
>Hi Stata list,
>When it comes to stepwise regression, both SPSS and Stata do something I
>don't understand. Take the made-up dataset below, where y has 8
>observations, x1 has 8 and x2 has 7.
>      y      x1      x2
>   4.00    1.00     .00
>   5.00    1.00     .00
>   6.00    1.00     .00
>   7.00    1.00    1.00
>   8.00     .00     .00
>   9.00     .00     .00
>  10.00     .00       .
>  11.00     .00     .00
>If I run a stepwise regression of y on x1 and x2, using the Forward
>procedure, then the final 'selected' model, which is equivalent to the
>regression of y on x1, only uses 7 observations. Is it based on any
>principle that models should be selected this way? If not, why do they
>not at least provide an option where you can use all available
>observations in the selection process?

I believe both SPSS and Stata start by doing listwise deletion of
missing data (MD).  When choosing a model, the comparisons would get
distorted if different cases were being analyzed at different steps,
i.e. you shouldn't compare a model with 8 cases and x1 to a model with
7 cases and x1 and x2.
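
To make the effect of listwise deletion concrete, here is a small sketch using the made-up dataset from the original post. It fits y on x1 twice, once on all 8 cases and once on the 7 cases left after dropping the row with missing x2. The helper `ols_y_on_x1` is a hypothetical name, and the code only reproduces the arithmetic, not the internals of either package.

```python
from statistics import mean

# The made-up dataset from the original post; None marks the missing x2 value.
rows = [
    (4.0, 1.0, 0.0),
    (5.0, 1.0, 0.0),
    (6.0, 1.0, 0.0),
    (7.0, 1.0, 1.0),
    (8.0, 0.0, 0.0),
    (9.0, 0.0, 0.0),
    (10.0, 0.0, None),
    (11.0, 0.0, 0.0),
]


def ols_y_on_x1(data):
    """Simple least-squares fit of y on x1; returns (intercept, slope, n)."""
    ys = [r[0] for r in data]
    xs = [r[1] for r in data]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope, len(data)


# Listwise deletion: drop any row with a missing value in ANY candidate
# variable, even if the final model ends up using only x1.
complete = [r for r in rows if None not in r]

all_fit = ols_y_on_x1(rows)
listwise_fit = ols_y_on_x1(complete)
print("all available cases:   ", all_fit)
print("listwise-deleted cases:", listwise_fit)
```

With all 8 cases the fit has intercept 9.5 and slope -4. With the 7 complete cases the intercept is about 9.33 and the slope about -3.83, which is why the 'selected' x1-only model does not match a plain regression of y on x1.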

Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
FAX:    (574)288-4373
HOME:   (574)289-5227
EMAIL:  [email protected]
WWW (personal):
WWW (department):

*   For searches and help try:
