Statalist



RE: st: RE: Stratify analysis - logistic regression with dummies


From   "Visintainer, Paul" <PAUL_VISINTAINER@NYMC.EDU>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: Stratify analysis - logistic regression with dummies
Date   Thu, 5 Jun 2008 14:26:11 -0400

Since race is a single categorical variable at 3 levels, the acceptable
approach is to create your model on the total sample, with all
categories of race represented.  Don't restrict your analysis to
subgroups, unless you have an a priori question or you've tested and
found significant interactions with race (e.g., race by age
interaction).  This approach will preserve the integrity of the
statistical inquiry.
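
A minimal sketch of that workflow, using the lbw data and the -xi- syntax from the original post. The interaction variable names (_IracXage_2, _IracXage_3) are what -xi- is expected to create for i.race*age; that is an assumption worth confirming with -describe _I*- before relying on the wildcard.

```stata
* Sketch only: assumes the lbw data used earlier in this thread
use http://www.stata-press.com/data/r10/lbw.dta, clear

* Fit one model on the full sample, with all three race categories
xi: logistic low i.race age

* Compare race 3 with race 2 from the same fit (no subsetting needed)
lincom _Irace_3 - _Irace_2, or

* Test the race-by-age interaction before considering stratified models
xi: logistic low i.race*age
testparm _IracXage*
```

If the -testparm- result gives no evidence of interaction, the ORs from the first (full-sample) model would be the ones to report under this approach.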

-p



______________________________________
Paul F. Visintainer, PhD
Department of Epidemiology and Biostatistics
School of Public Health
New York Medical College
PH: (914) 594-4804
FX: (914) 594-4853
 

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Ricardo Ovaldia
Sent: Thursday, June 05, 2008 2:04 PM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: Stratify analysis - logistic regression with dummies

Thank you, Paul. That makes perfect sense. However, the question of which
OR is best to report remains, especially if the means of the continuous
variable differ for each level of the class variable.
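
One way to see how much the continuous covariate differs across levels of the class variable, sketched with the lbw data from the original post (nothing here goes beyond standard -tabstat- and -summarize- calls):

```stata
* Sketch only: assumes the lbw data used earlier in this thread
use http://www.stata-press.com/data/r10/lbw.dta, clear

* N and mean of age within each race category
tabstat age, by(race) statistics(n mean)

* Mean age in the full sample and in each restricted sample
summarize age
summarize age if race==1 | race==2
summarize age if race==1 | race==3
```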

 

Ricardo Ovaldia, MS
Statistician 
Oklahoma City, OK


--- On Thu, 6/5/08, Visintainer, Paul <PAUL_VISINTAINER@NYMC.EDU> wrote:

> From: Visintainer, Paul <PAUL_VISINTAINER@NYMC.EDU>
> Subject: st: RE: Stratify analysis - logistic regression with dummies
> To: statalist@hsphsun2.harvard.edu
> Date: Thursday, June 5, 2008, 9:49 AM
> Ricardo,
> 
> The difference is probably due to the fact that you are developing your
> models on different sample sizes and, as a consequence, with a different
> mean age in each sample.  This isn't a problem when you are computing
> an unadjusted OR for a categorical variable.  (Compare your unadjusted
> -logit- command with the equivalent -tabodds- command.  In your example,
> your first logit command -xi: logistic low i.race- is computing the ORs
> for a 3x2 table.  You can replicate the logit command by running
> -tabodds low race, or-.)
> 
> When you add age as a continuous variable to your model AND use the "if"
> qualifier, your model alternately excludes observations that are
> either RACE2 or RACE3, so your sample size changes (e.g., n=122 or
> n=163).  The adjustment for age is based on the mean age of the sample
> being used to estimate the OR.  Thus, as the sample changes, so does
> the mean age:
> 
> For n=189: mean age== 23.2381
> For n=163: mean age== 23.5092  (no RACE2)
> For n=122: mean age== 23.7049 (no RACE3)
> 
> -p
> 
> 
> ______________________________________
> Paul F. Visintainer, PhD
> Department of Epidemiology and Biostatistics
> School of Public Health
> New York Medical College
> PH: (914) 594-4804
> FX: (914) 594-4853
>  
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Ricardo Ovaldia
> Sent: Wednesday, June 04, 2008 10:15 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: Stratify analysis - logistic regression with dummies
> 
> I am confused by some of the results that I got. I will illustrate using
> Hosmer & Lemeshow's low birth weight data:
> 
> . use http://www.stata-press.com/data/r10/lbw.dta
> (Hosmer & Lemeshow data)
> 
> if I fit
> 
> . xi:logistic low i.race
> 
> and then fit
> 
> . xi:logistic low i.race if race==1 | race==2
> 
> and
> 
> . xi:logistic low i.race if race==1 | race==3
> 
> I get the same OR for _Irace_2 and _Irace_3 as I do for the full
> model. This is as expected because the dummies are orthogonal to each
> other.
> 
> However, when a covariate is added to the model, the same is not true
> anymore:
> 
>  
> . xi:logistic low i.race age
> 
> ------------------------------------------------------------------------------
>          low | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>     _Irace_2 |   2.106974   .9932407     1.58   0.114     .8363679    5.307878
>     _Irace_3 |   1.767748   .6229325     1.62   0.106     .8860686    3.526738
>          age |   .9612592   .0311206    -1.22   0.222     .9021588    1.024231
> ------------------------------------------------------------------------------
> 
> . xi:logistic low i.race age if race==1 | race==2
> 
> ------------------------------------------------------------------------------
>          low | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>     _Irace_2 |   2.155207   1.021287     1.62   0.105     .8513944     5.45566
>          age |   .9705512   .0376446    -0.77   0.441     .8995039     1.04721
> ------------------------------------------------------------------------------
> 
> . xi:logistic low i.race age if race==1 | race==3
> 
> ------------------------------------------------------------------------------
>          low | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>     _Irace_3 |   1.724551   .6098827     1.54   0.123     .8622856    3.449063
>          age |   .9440875   .0340586    -1.59   0.111     .8796392    1.013258
> ------------------------------------------------------------------------------
> 
> 
> There is no missing data.
> 
> 
> I am very confused about which OR to report and what the differences
> between these models are. I was not expecting these results.
> 
> Thank you in advance,
> Ricardo.
> 
> 
> Ricardo Ovaldia, MS
> Statistician 
> Oklahoma City, OK
> 
> 
> 
> 
>       
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/





© Copyright 1996–2014 StataCorp LP