# Re: st: RE: Stratify analysis - logistic regression with dummies

 From Ricardo Ovaldia To statalist@hsphsun2.harvard.edu Subject Re: st: RE: Stratify analysis - logistic regression with dummies Date Thu, 5 Jun 2008 11:03:36 -0700 (PDT)

```Thank you Paul. That makes perfect sense. However, the question of which OR is best to report remains, especially if the mean of the continuous variable differs for each level of the class variable.

Ricardo Ovaldia, MS
Statistician
Oklahoma City, OK

--- On Thu, 6/5/08, Visintainer, Paul <PAUL_VISINTAINER@NYMC.EDU> wrote:

> From: Visintainer, Paul <PAUL_VISINTAINER@NYMC.EDU>
> Subject: st: RE: Stratify analysis - logistic regression with dummies
> To: statalist@hsphsun2.harvard.edu
> Date: Thursday, June 5, 2008, 9:49 AM
> Ricardo,
>
> The difference is probably due to the fact that you are fitting your
> models on different sample sizes and, as a consequence, with a
> different mean age in each sample.  This isn't a problem when you are
> computing an unadjusted OR for a categorical variable.  (Compare your
> -logistic- command with the equivalent -tabodds- command.  Your first
> command, -xi: logistic low i.race-, is computing the ORs for a 3x2
> table.  You can replicate it by running -tabodds low race, or-.)
>
> When you add age as a continuous variable to your model AND use the
> "if" statement, your model alternately excludes the observations that
> are either RACE2 or RACE3, so your sample size changes (e.g., n=122 or
> n=163).  The adjustment for age is based on the mean age of the sample
> being used to estimate the OR.  Thus, as the sample changes, so does
> the mean age:
>
> For n=189: mean age== 23.2381
> For n=163: mean age== 23.5092  (no RACE2)
> For n=122: mean age== 23.7049 (no RACE3)
>
> -p
>
>
> ______________________________________
> Paul F. Visintainer, PhD
> Department of Epidemiology and Biostatistics
> School of Public Health
> New York Medical College
> PH: (914) 594-4804
> FX: (914) 594-4853
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Ricardo Ovaldia
> Sent: Wednesday, June 04, 2008 10:15 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: Stratify analysis - logistic regression with dummies
>
> I am confused by some of the results that I got. I will illustrate
> using Hosmer & Lemeshow's low birth weight data:
>
> . use http://www.stata-press.com/data/r10/lbw.dta
> (Hosmer & Lemeshow data)
>
> If I fit
>
> . xi:logistic low i.race
>
> and then fit
>
> . xi:logistic low i.race if race==1 | race==2
>
> and
>
> . xi:logistic low i.race if race==1 | race==3
>
> I get the same OR for _Irace_2 and _Irace_3 as I do for the full
> model. This is as expected because the dummies are orthogonal to each
> other.
>
> However, when a covariate is added to the model, this is no longer
> true:
>
>
> . xi:logistic low i.race age
>
> ------------------------------------------------------------------------------
>          low | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>     _Irace_2 |   2.106974   .9932407     1.58   0.114     .8363679    5.307878
>     _Irace_3 |   1.767748   .6229325     1.62   0.106     .8860686    3.526738
>          age |   .9612592   .0311206    -1.22   0.222     .9021588    1.024231
> ------------------------------------------------------------------------------
>
> . xi:logistic low i.race age if race==1 | race==2
>
> ------------------------------------------------------------------------------
>          low | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>     _Irace_2 |   2.155207   1.021287     1.62   0.105     .8513944     5.45566
>          age |   .9705512   .0376446    -0.77   0.441     .8995039     1.04721
> ------------------------------------------------------------------------------
>
> . xi:logistic low i.race age if race==1 | race==3
>
> ------------------------------------------------------------------------------
>          low | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>     _Irace_3 |   1.724551   .6098827     1.54   0.123     .8622856    3.449063
>          age |   .9440875   .0340586    -1.59   0.111     .8796392    1.013258
> ------------------------------------------------------------------------------
>
>
> There is no missing data.
>
>
> I am very confused about which OR to report and what the differences
> between these models are. I was not expecting these results.
>
> Ricardo.
>
>
> Ricardo Ovaldia, MS
> Statistician
> Oklahoma City, OK
>
>
>
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

```
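A minimal sketch of the checks Paul describes, using the same lbw.dta file as the thread. The -tabstat- line is an addition that does not appear in the original messages; it is included on the assumption that listing the mean age within each level of race helps with Ricardo's follow-up question. Output is omitted.

```stata
* Sketch only: reruns the comparisons discussed in the thread
use http://www.stata-press.com/data/r10/lbw.dta, clear

* Unadjusted ORs: the logistic fit and the 3x2 table should agree
xi: logistic low i.race
tabodds low race, or

* Mean age in each estimation sample
summarize age                          // all observations
summarize age if race==1 | race==3     // race 2 excluded
summarize age if race==1 | race==2     // race 3 excluded

* Addition (not in the thread): mean age within each level of race
tabstat age, by(race) statistics(n mean)
```

If the within-race means differ noticeably, the full-sample model and the two-group models are adjusting over different age distributions, which would be consistent with the small shifts in the adjusted ORs reported above.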