# Re: st: stepwise

| From | Timothy.Mak@iop.kcl.ac.uk |
|---|---|
| To | statalist@hsphsun2.harvard.edu |
| Subject | Re: st: stepwise |
| Date | Mon, 4 Sep 2006 15:19:14 +0100 |

```
Hi Richard,

Do you know how the SPSS pairwise procedure works? I don't think it works
the way I want it to.

Now I'm really curious about why you suggested using a lower cut-off than
.05. In fact I was going to use 0.15, as suggested in Hosmer and Lemeshow.
I thought the point of a stepwise regression is to mimic what would happen
if we could include every relevant regressor, so that the interpretation
of the regression coefficient is something like the effect of A assuming
everything else holds constant. I thought if we allow more predictors in,
then these other predictors would approximate the 'everything else' better
than if we only let the really significant ones in. If possible I would
put everything in the equation, but for two concerns: 1. Some predictors
simply do not go together - it doesn't make sense to talk about the effect
of B controlled for C, for whatever reason. 2. With too many parameters
relative to the sample size, the asymptotic results that logistic
regression relies on become unreliable.

Perhaps some more background would help. In essence, we're trying to
identify 'important' predictors of 'functional outcomes' of patients. We
have 30 predictors from childhood up to the present. I know that considerable
literature exists for identifying 'important' covariates (e.g. the dominance
analysis of Budescu (1993, Psychological Bulletin)). But since the use of
these techniques seems currently limited to specialist statistics journals,
I thought I would go by the more traditional way of using simple and
multiple regression. The pattern that I try to follow is from the
following paper: Bienvenu et al (2006) British Journal of Psychiatry
188:432. They do both simple and multiple logistic regression controlling
for all of their predictors. They have a larger sample than we do and
therefore they can fit 10-20 regressors in their regression. With our
smaller sample, there's no way we'll put all 30 predictors into our model.
My colleague ran a stepwise regression and it threw away about 40% of our
data, which I'm not so pleased about. I'd therefore like to know how
PAIRWISE might help remedy the situation. Whether I use PAIRWISE or
LISTWISE, I think it would probably be better to re-run the regression
afterwards using only the selected variables.

You are right that at the end of the day we won't be able to put much
trust in the coefficients, but I feel it's still better than nothing.

Thanks,

Tim

Richard Williams <Richard.A.Williams.5@ND.edu>
Sent by: owner-statalist@hsphsun2.harvard.edu
04/09/2006 15:45
statalist@hsphsun2.harvard.edu

To: statalist@hsphsun2.harvard.edu
cc:
Subject: Re: st: stepwise

At 04:30 AM 9/4/2006, Timothy.Mak@iop.kcl.ac.uk wrote:

>stepwise regression is needed. Say we have n = 200, and a potential pool
>of predictors = 50, say that each of these 50 predictors has 1 or 2
>missing values, not necessarily randomly. Using the Stata stepwise procedure, we

Two other quick comments:  Since you are also using SPSS, you could
use pairwise deletion of missing data, which might be ok if the data
are missing randomly (but if nonrandom, you've got some problems
regardless of what you do).

Also, if you've got 50 predictors and are using the .05 level of
significance, then just by chance alone you'd expect 2 or 3 vars to
enter in.  I'd suggest using the .01 level of significance; or figure
out exactly what alpha level to use by doing, say, a Bonferroni

Overall though, I probably wouldn't feel very comfortable with SW in
this case, regardless of how I did it!  I'd certainly want to add
some cautionary notes in my writeup, perhaps labeling the analysis as
exploratory and in need of replication in other studies.

I wonder if some of those 50 vars couldn't be combined into a smaller
number of scales?  Do you really have 50 unique concepts here, or do
you maybe have several items that tap into the same concept in
slightly different ways?

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
FAX:    (574)288-4373
HOME:   (574)289-5227
EMAIL:  Richard.A.Williams.5@ND.Edu
WWW (personal):    http://www.nd.edu/~rwilliam
WWW (department):    http://www.nd.edu/~soc

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
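
Richard's point about chance entries at the .05 level, and the Bonferroni alternative he mentions, comes down to simple arithmetic. The Python sketch below is illustrative only: it assumes that none of the candidate predictors is truly related to the outcome and that the tests are independent, neither of which is claimed in the thread, and the function names are made up for this example.

```python
def expected_chance_entries(n_predictors, alpha):
    """Expected number of null predictors that pass a test at level alpha."""
    return n_predictors * alpha


def bonferroni_alpha(family_alpha, n_tests):
    """Per-test level that keeps the family-wise error rate at family_alpha."""
    return family_alpha / n_tests


def prob_at_least_one(n_predictors, alpha):
    """Chance that at least one null predictor passes (independent tests assumed)."""
    return 1 - (1 - alpha) ** n_predictors


if __name__ == "__main__":
    # Richard's scenario: 50 candidates at the .05 level.
    print(round(expected_chance_entries(50, 0.05), 3))
    print(round(bonferroni_alpha(0.05, 50), 5))
    print(round(prob_at_least_one(50, 0.05), 3))
    # Tim's scenario: 30 candidates at the 0.15 entry level.
    print(round(expected_chance_entries(30, 0.15), 3))
```

With 50 candidates at .05 the expected count is 2.5, which matches the "2 or 3 vars" in Richard's message. A looser entry level such as 0.15 raises that expected count, which is the trade-off the two posters are discussing.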

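
The concern about listwise deletion discarding data can also be illustrated with the quoted scenario of n = 200 and 50 predictors, each with 1 or 2 missing values. The sketch below is a toy simulation, not a description of how Stata or SPSS handle missing data internally. It assumes the missing values are scattered uniformly at random across rows, which the thread explicitly says may not hold, and `complete_cases` is a hypothetical helper written for this example.

```python
import random


def complete_cases(n_rows, missing_per_var, seed=0):
    """Count rows with no missing value in any variable.

    missing_per_var[j] is the number of missing values in variable j.
    Their row positions are drawn uniformly at random (an assumption).
    """
    rng = random.Random(seed)
    incomplete = set()
    for k in missing_per_var:
        # Pick k distinct rows where this variable is missing.
        incomplete.update(rng.sample(range(n_rows), k))
    return n_rows - len(incomplete)


# 50 predictors alternating between 1 and 2 missing values (75 in total).
missing = [1 + (j % 2) for j in range(50)]

if __name__ == "__main__":
    kept = complete_cases(200, missing)
    print(f"{kept} of 200 rows survive listwise deletion")
```

In this setup listwise deletion drops between 2 and 75 rows, depending on how much the missing values overlap across predictors. A handful of missing values per variable can therefore remove a large share of the sample when many variables are considered at once.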