Re: st: clustering in proportional hazards models with stata/mp 10.0 - conditional logistic

From	Ricardo Ovaldia <[email protected]>
To	[email protected]
Subject	Re: st: clustering in proportional hazards models with stata/mp 10.0 - conditional logistic
Date	Wed, 24 Oct 2007 05:47:12 -0700 (PDT)
Thank you Dr. Gould for a thorough and clear
explanation. 

I have a similar problem related to conditional
logistic regression. I have data from a multi-center
study (7 clinics). I analyzed the data using
conditional logistic regression, grouping on clinic.
I was asked to defend my method, because previous
analyses of these data were performed using indicator
variables for clinic or simply using a robust variance
estimator.
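
For concreteness, the three analyses mentioned above might look
something like the following in Stata. This is only a sketch: the
variable names y (binary outcome), x1 and x2 (covariates), and clinic
(the clinic identifier) are placeholders, not the actual study
variables, and the last command assumes the robust variance was
clustered on clinic.

```stata
* conditional logistic regression, grouping on clinic (odds ratios)
clogit y x1 x2, group(clinic) or

* ordinary logistic regression with indicator variables for clinic
xi: logistic y x1 x2 i.clinic

* ordinary logistic regression, clinic omitted, cluster-robust variance
logistic y x1 x2, vce(cluster clinic)
```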

I am planning to use the explanation from Dr. Gould's
post. However, the argument that I would use for
conditional logistic regression is the same as the one
presented for the indicator variables (dummies), so I
am missing something: what is the difference? By the
way, the results I obtained using conditional logistic
regression and dummies are very similar.

Thank you,
Ricardo




--- "William Gould, StataCorp LP" <[email protected]>
wrote:

> Daniel Koralek <[email protected]> writes about using -stcox- on individual
> data where each individual was recruited from one of ten centers.  He is
> concerned that which center may influence survival because "different foods
> eaten in different regions may influence nutrients".
> 
> He considers three ways of dealing with this problem,
> 
>        . stcox ..., vce(cluster center)               (1)
> 
>        . xi:  stcox ... i.center                      (2)
> 
>        . stcox ..., strata(center)                    (3)
> 
> and, of course, he could ignore center altogether
> 
>        . stcox ... [center completely omitted]        (0)
> 
> As a matter of notation, let's assume the other covariates in the
> models (the ... part) are x1 and x2.
> 
> My comments are as follows:
> 
> Re solution (0):
> 
>      This solution assumes center has no effect and Daniel has already
>      raised concerns that it does, so the solution is inappropriate.
> 
> Re solution (1):
> 
>      This solution also assumes center has no effect; it instead
>      conservatively handles the situation where the individual patients
>      are overly homogeneous, which is to say, not independent draws.
>      Actually, I didn't say that exactly right for the Cox model, but
>      what I said implies what I should have said, which is that
>      selections of the failures from the risk pools at each failure time
>      are not independent.
> 
>      Daniel tried solution (1) and found that the standard errors changed,
>      but the reported coefficients did not.  Exactly.  Under solution (1),
>      because center has no effect, the coefficients estimated the standard
>      way are fine, although perhaps inefficient.  The lack of independence,
>      however, means standard errors usually will be understated and
>      -vce(cluster center)- handles that.
> 
> Re solution (2):
> 
>      This solution assumes that center does have a direct effect on
>      survival, and it constrains the effect to be a multiplicative
>      shift in the baseline hazard function.  The baseline hazard
>      function ho(t) is a function of time, such as
> 
>             ho(t)
>               |             .
>               | .         .   .
>               |. .       .
>               |   .    .
>               |     . .
>               |
>               +-------------------  time
> 
>       FYI, the baseline survival function So(t) is the integral of
>       ho(t), negated and exponentiated.  There's nothing deep there;
>       that's just the mathematical formula for calculating one
>       from the other.  I switched to hazard functions, however,
>       because the hazard function is the natural metric for the Cox model.
>       The hazard rate for a particular individual in the data at a particular
>       time is just ho(t)*exp(X_i*b), where X_i are the individual's covariates
>       at time t.  That's why I said solution (2) constrains each center's
>       effect to be a multiplicative shift of ho(t).
> 
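
In symbols, the relationships described above can be written as
follows. This is only a restatement of the formulas in the text; the
integration variable u and the coefficient \gamma_c on the dummy for
center c are notation introduced here for illustration and do not
appear in the original post.

```latex
S_0(t) = \exp\!\left(-\int_0^t h_0(u)\,du\right),
\qquad
h_i(t) = h_0(t)\,\exp(X_i b)

% solution (2): individual i in center c, with x1 and x2 as the other covariates
h_i(t) = h_0(t)\,\exp(\gamma_c)\,\exp(b_1 x_{1i} + b_2 x_{2i})
```

The factor \exp(\gamma_c) does not depend on t, which is what makes the
center effect a purely multiplicative shift of h_0(t).
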
>       Concerning our use of dummy variables for the centers,
>       we would like to think that we chose this particular functional form
>       because it is truly representative of how the different
>       foods served in the different centers influence the hazard, but
>       the fact is that we choose this functional form because it is
>       convenient; the effect of each center is wrapped up in just a
>       single coefficient.
> 
>       This is not a bad approach.  
> 
> Re solution (2.5):
> 
>       Alright, I admit that Daniel did not include a solution (2.5), but
>       I want to add it; it will help to understand solution (2), and
>       is often useful in and of itself.
> 
>       Solution (2) was
> 
>        . xi:  stcox ... i.center                      (2)
> 
>       Solution 2.5 is
> 
>        . xi:  stcox ... i.center i.center*x1          (2.5)
> 
>       In this solution, we assume that center does not merely shift
>       the hazard function in a multiplicative way, we assume that
>       center modifies the effect of x1.
> 
>       Actually, there are a lot of solution (2.5)'s.  I could have chosen
>       x2 rather than x1,
> 
>        . xi:  stcox ... i.center i.center*x2
> 
>       or even x1 and x2,
> 
>        . xi:  stcox ... i.center i.center*x1 i.center*x2
> 
>      Anyway, in this modeling-based approach, we need to think carefully
>      about how the different foods served in the centers affect the shifting
>      of the baseline hazard function.  Is it just a shift (solution 2),
>      or do the different foods modify the effect of x1 (solution 2.5), or
>      something else?
> 
>      We also need to appreciate that we are assuming the SHAPE of the
>      survivor function is the same across all centers and that we are
>      just moving it up and down, multiplicatively.
> 
> 
> Re solution (3):
> 
>      In this solution, we let the baseline hazard be different for each
>      center.  That is, rather than assuming the baseline function is
> 
>             ho(t)
>               |             .
>               | .         .   .
>               |. .       .
>               |   .    .
>               |     . .
>               |
>               +-------------------  time
> 
>       for all centers, albeit shifted, we assume that the above picture might
>       be the baseline function for center 1, and for center 2, the function
>       could be completely different:
> 
>             ho(t)
>               |    . . . 
>               |   .     .
>               |. .       .
>               | .         .
>               |            . . .
>               |
>               +-------------------  time
> 
=== message truncated ===


Ricardo Ovaldia, MS
Statistician 
Oklahoma City, OK
