Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Stuck in a logistic regression. Roadmap

| From | daniel klein |
| --- | --- |
| To | statalist@hsphsun2.harvard.edu |
| Subject | Re: st: Stuck in a logistic regression. Roadmap |
| Date | Sat, 1 Dec 2012 10:48:23 +0100 |

```
Pedro,

regarding stepwise regression techniques, see:
http://www.stata.com/support/faqs/statistics/stepwise-regression-problems/.

If you decide to stick with (backward) stepwise regression anyway,
have a look at -help stepwise-. You might also want to have a look
at Lindsey and Sheather (2010), even though the authors restrict
their discussion to linear models.

There are different ways of keeping the number of observations the
same across models (your question 1), but I think the most basic way
is to use e(sample). Stata stores the estimation sample in this
"variable" after any estimation. Think of e(sample) as an indicator
(dummy) variable, where 1 indicates that an observation has been used
in the previous estimation. If you want to run two models, you need to
run the "full" model first, then restrict the sample in the second
model using an -if- qualifier.

Here is a short (nonsense) example

sysuse auto ,clear

// create some missing values in price
replace price = . in 1/23

// note that price now has 23 missing values
su foreign price mpg

// run the "full" model
logit foreign price mpg

// note that only 51 out of the 74 observations are used in the model

// now run the "reduced" model
logit foreign mpg if e(sample)

// Stata uses the same 51 observations, indicated by e(sample)
// -if e(sample)- is just the short way of typing -if (e(sample) == 1)-

// if you need this specific sample another time,
// but want to run other models, you can "copy"
// the e(sample) variable to your dataset
g byte my_sample = e(sample)
ta my_sample
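
// A possible follow-up, not part of the original example: a sketch of
// the likelihood-ratio test from question 1, run on the common sample.
// The names "full" and "reduced" are arbitrary labels chosen here.
logit foreign price mpg if my_sample
estimates store full
logit foreign mpg if my_sample
estimates store reduced

// both models now use the same observations, so -lrtest- is valid
lrtest full reduced

// Likewise only a sketch: backward stepwise selection based on
// likelihood-ratio tests (question 2). The removal threshold pr(0.2)
// and the extra candidate variable weight are assumptions made for
// illustration, not recommendations.
stepwise, pr(0.2) lr: logit foreign price mpg weight if my_sample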

Best
Daniel

Lindsey, Charles, and Sheather, Simon (2010). Variable selection in
linear regression. The Stata Journal, 10(4), 650-669.

--
Dear Statalist,

I have just arrived at Stata in the last month. Even though I find it
easier and more flexible than my previous software for standard
statistics, I am stuck performing a logistic regression because I find
the style is very different from SPSS.  I would like to ask you some
questions:

1) In the variable selection phase, I performed an lrtest of the
constant-only model against the model with the variable I am trying
to test. If the number of observations differs, the lrtest is not
valid. What method do you recommend in this case?

2) I used SPSS, where I did backward stepwise logistic regression
based on the LR. Can I perform this kind of analysis in Stata? Is
stepwise a recommended method to perform this kind of regression?

3) I used a macro called AllSetsReg in SPSS, with which I could obtain
the best subsets based on Mallows' Cp and AIC. I know there are some
packages to do that in Stata, but I have more than 6 variables. Do you
know of any package or method to do that?

4) I am following the Hosmer-Lemeshow way of performing the
regression, but I don't know if that's the best way to do it. Is there
a better way to perform a logistic regression? Do you have any
suggestions or a modeling roadmap that works for you?

```