Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: Cox regression using a shared frailty model in multiply imputed data


From   Justin Schaffer <[email protected]>
To   [email protected]
Subject   Re: st: Cox regression using a shared frailty model in multiply imputed data
Date   Wed, 12 Feb 2014 05:06:09 -0800

Follow-up to my earlier question regarding using a Cox model with
shared frailty in multiply imputed data. Interestingly, other
extensions of the Cox model such as using time varying covariates
behave as expected in multiply imputed data (e.g. Stata's output
provides a separate estimation of the hazard ratio of the time varying
covariate after running the Cox regression). Below I show output of a
Cox model including both the shared frailty "random effect" variable
and a time varying covariate run separately in my original dataset as
well as in my multiply imputed dataset. We see the hazard ratio of the
tvc is provided, but again, no output regarding theta or the
likelihood-ratio test of theta=0 (estimations of the shared frailty
variable's random effect) is provided in the output from our command
on multiply-imputed data (although theta is provided in the output
from running the command on the non-imputed dataset). Any explanation
as to why Stata does not provide theta and the likelihood-ratio test
of theta=0 in the imputed command of a shared frailty Cox regression
would be GREATLY appreciated.

Example:

stset daysAlive, fail(dead)
sts generate nelsonAalen = na
mi set wide model01
mi register regular age hosp_id timeVaryCovar nelsonAalen dead
mi register impute imputed_variable
mi impute chained (pmm) imputed_variable = age hosp_id timeVaryCovar
dead nelsonAalen, add(20) augment chaindots burnin(10) rseed(1234)

mi stset daysAlive, fail(dead)

//Cox model with shared frailty and timeVarying on imputed dataset
mi estimate, hr dots: stcox age imputed_variable timeVaryCovar,
shared(hosp_id) tvc(timeVaryCovar) texp(ln(_t))
> Imputations (20):
>  .........10.........20 done
>
> Multiple-imputation estimates                     Imputations     =         20
> Cox regression: Breslow method for ties           Number of obs   =       3174
>                                                  Average RVI     =     0.0211
>                                                  Largest FMI     =     0.1433
> DF adjustment:   Large sample                     DF:     min     =     949.62
>                                                          avg     =   1.08e+08
>                                                          max     =   1.42e+09
> Model F test:       Equal FMI                     F(  15,651305.5)=       7.18
> Within VCE type:          OIM                     Prob > F        =     0.0000
>
> ----------------------------------------------------------------------------------------------
>                          _t | Haz. Ratio   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -----------------------------+----------------------------------------------------------------
> main                         |
>                         age |   1.026644   .0055998     4.82   0.000     1.015727    1.037678
>              imputed_variable |   1.222623   .0921528     2.67   0.008     1.054516    1.417528
>               timeVaryCovar |   1.052671   .2384387     0.23   0.821     .6752872    1.640956
> -----------------------------+----------------------------------------------------------------
> tvc                          |
>               timeVaryCovar |  -.0232224   .0373115    -0.62   0.534    -.0963516    .0499069
> ----------------------------------------------------------------------------------------------

//Cox model with shared frailty and timeVarying on imputed dataset
stcox age imputed_variable timeVaryCovar, shared(hosp_id)
tvc(timeVaryCovar) texp(ln(_t))
>          failure _d:  dead == 1
>   analysis time _t:  daysAlive
>
> Fitting comparison Cox model:
>
> Estimating frailty variance:
>
> Iteration 0:   log profile likelihood = -6299.5046
> Iteration 1:   log profile likelihood = -6299.5046  (backed up)
> Iteration 2:   log profile likelihood = -6299.4122
> Iteration 3:   log profile likelihood = -6299.3645
> Iteration 4:   log profile likelihood = -6299.3645
>
> Fitting final Cox model:
>
> Iteration 0:   log likelihood =  -6360.264
> Iteration 1:   log likelihood =  -6300.206
> Iteration 2:   log likelihood =  -6299.366
> Iteration 3:   log likelihood = -6299.3645
> Iteration 4:   log likelihood = -6299.3645
> Refining estimates:
> Iteration 0:   log likelihood = -6299.3645
>
> Cox regression --
>         Breslow method for ties                Number of obs      =      2590
>         Gamma shared frailty                   Number of groups   =        58
> Group variable: hosp_id
>
> No. of subjects =         2590                  Obs per group: min =         1
> No. of failures =          889                                 avg =  44.65517
> Time at risk    =      2367534                                 max =       278
>
>                                                Wald chi2(15)      =     97.50
> Log likelihood  =   -6299.3645                  Prob > chi2        =    0.0000
>
> ----------------------------------------------------------------------------------------------
>                          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -----------------------------+----------------------------------------------------------------
> main                         |
>                         age |   1.026776   .0063275     4.29   0.000     1.014449    1.039253
>             imputed_variable |   1.280947   .1006414     3.15   0.002     1.098131    1.494199
>               timeVaryCovar |   .8734977   .2178749    -0.54   0.588     .5357327    1.424214
> -----------------------------+----------------------------------------------------------------
> tvc                          |
>               timeVaryCovar |   .0059519   .0413603     0.14   0.886    -.0751129    .0870166
> -----------------------------+----------------------------------------------------------------
>       theta |   .0274517   .0189878
> ------------------------------------------------------------------------------
> Likelihood-ratio test of theta=0: chibar2(01) =     4.29 Prob>=chibar2 = 0.019
>
> Note: standard errors of hazard ratios are conditional on theta.

Again, here we see theta estimated in the non-imputed command, but no
output from Stata regarding theta in the command run on the
multiply-imputed data. However, the command run on the
multiply-imputed data does give me a hazard ratio for a time-varying
covariate (a different extension of the Cox model). If anyone can tell
me why theta is not provided in the shared frailty Cox command on the
multiply imputed dataset and whether I am staistically "valid" in
estimating theta on my own by running the command separately on each
of the imputed datasets (and then combining using Rubin's rules) I
would really appreciate it.

Sincerely,

Justin

On Tue, Feb 11, 2014 at 5:53 AM, Justin Schaffer
<[email protected]> wrote:
> Hello, first time poster here (please excuse my ignorance).
>
> Stata (using Stata 13) will allow me to create a Cox regression model
> with shared frailty on a multiply imputed dataset. However, it does
> not give me an estimate of theta after running the command as it does
> when I run the regression model with shared frailty on non-imputed
> data. Can someone explain why this is, and whether I am violating some
> obscure law of statistics when I create a Cox regression model with
> shared frailty on my imputed dataset? I assume that estimating the
> standard error of theta is not statistically valid on an imputed
> dataset which is why the theta value is not shown, but to be honest, I
> am out of my league here. One option I'm considering is running the
> regression on each of my imputed data sets after creating a separate
> file for each imputed dataset (using mi set flongsep) and then
> averaging the theta values and estimating the standard error using
> Rubin's rules (although I have no clue if this is statistically
> valid). Notably, the variable that I regress on as well as the
> "random" variable of my shared frailty model need not be imputed (i.e.
> in the example below I run a Cox regression with shared frailty on the
> variable "age" with the variable "hosp_id" as my random effect
> variable, although neither "age" nor "hosp_id" were actually imputed
> in my dataset).
>
> Example:
>
> stset daysAlive, fail(dead)
> sts generate nelsonAalen = na
>
> mi set wide model01
> mi register regular age hosp_id nelsonAalen dead
> mi register impute imputed_variable
> mi impute chained (pmm) imputed_variable = age hosp_id dead
> nelsonAalen, add(20) augment chaindots burnin(10) rseed(1234)
>
> mi stset daysAlive, fail(dead==1)
>
> //Cox model with shared frailty on imputed dataset
> mi estimate, hr dots: stcox age, shared(hosp_id)
>> Imputations (20):
>>   .........10.........20 done
>>
>> Multiple-imputation estimates                     Imputations     =   20
>> Cox regression: Breslow method for ties           Number of obs   = 3174
>>                                                   Average RVI     = 0.0000
>>                                                   Largest FMI     = 0.0000
>> DF adjustment:   Large sample                     DF:     min     =    .
>>                                                           avg     =    .
>>                                                           max     =    .
>> Model F test:       Equal FMI                     F(   1,      .) =  31.39
>> Within VCE type:          OIM                     Prob > F        = 0.0000
>>
>> ------------------------------
> ------------------------------------------------
>>           _t | Haz. Ratio   Std. Err.      t    P>|t|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>          age |   1.030194   .0054693     5.60   0.000      1.01953  1.040969
>> ------------------------------------------------------------------------------
>
> //Cox model with shared frailty on non-imputed dataset
> stcox age, shared(hosp_id)
>>          failure _d:  dead == 1
>>    analysis time _t:  daysAlive
>>
>> Fitting comparison Cox model:
>>
>> Estimating frailty variance:
>>
>> Iteration 0:   log profile likelihood = -8334.2065
>> Iteration 1:   log profile likelihood = -8334.2065  (backed up)
>> Iteration 2:   log profile likelihood = -8333.3181
>> Iteration 3:   log profile likelihood = -8333.2928
>> Iteration 4:   log profile likelihood = -8333.2927
>>
>> Fitting final Cox model:
>>
>> Iteration 0:   log likelihood = -8369.2282
>> Iteration 1:   log likelihood = -8333.5078
>> Iteration 2:   log likelihood = -8333.2927
>> Iteration 3:   log likelihood = -8333.2927
>> Refining estimates:
>> Iteration 0:   log likelihood = -8333.2927
>>
>> Cox regression --
>>          Breslow method for ties                Number of obs      = 3174
>>          Gamma shared frailty                   Number of groups=        60
>> Group variable: hosp_id
>>
>> No. of subjects =         3174                  Obs per group: min=         1
>> No. of failures =         1138                                 avg = 52.9
>> Time at risk    =      2951413                                 max=       283
>>
>>                                                 Wald chi2(1)       =31.39
>> Log likelihood  =   -8333.2927                  Prob > chi2        =0.0000
>>
>> ------------------------------------------------------------------------------
>>           _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf.Interval]
>> -------------+----------------------------------------------------------------
>>          age |   1.030194   .0054693     5.60   0.000      1.01953  1.040969
>> -------------+----------------------------------------------------------------
>>        theta |   .0391666   .0190069
>> ------------------------------------------------------------------------------
>> Likelihood-ratio test of theta=0: chibar2(01) =    11.50 Prob>=chibar2 =0.000
>>
>> Note: standard errors of hazard ratios are conditional on theta.
>
> Many thanks in advance for any statistical advice you gurus have to offer.
>
> Sincerely,
>
> JMS
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index