
# Re: st: How to proceed a landmark survival analysis (tests and plots)?

| From | To | Subject | Date |
| --- | --- | --- | --- |
| Yuval Arbel | statalist@hsphsun2.harvard.edu | Re: st: How to proceed a landmark survival analysis (tests and plots)? | Wed, 26 Oct 2011 17:19:18 +0200 |

```
Hi Anna,

Fortunately, I recently completed a paper draft based on this
methodology, and I explored it thoroughly, so I might be able to help
you. I apologize for the long e-mail and hope you can follow all of
the stages.

In my opinion, what you have done so far is only the very elementary
stage of the research, more or less equivalent to presenting summary
statistics of the variables.

Assuming that you have only one kind of failure (the patient either
survives or dies), what you need to do next is to use the -stcox-
command to run a regression analysis. In other words, you need to
control for other variables that might cause patients not to survive,
in order to isolate the impact of therapy1 and therapy2. For this you
should look at the Stata manual entry for -stcox-. Also see the
example I give below.
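
A minimal sketch of this step for your data, assuming survivalStatus
equals 1 for a death and therapyGroup is coded 0/1 (both codings are
my assumptions, as is the new variable name "period"):

* declare the survival data; id() is required before -stsplit-
stset survivalTime, failure(survivalStatus==1) id(id)
* split each record at the landmark: period is 0 before month 24, 24 after
stsplit period, at(24)
* Cox regression of the therapy effect before and after the landmark
stcox therapyGroup if period==0
stcox therapyGroup if period==24

Any other covariates you want to control for would be listed after
therapyGroup.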

The final stage of the analysis (which is the most interesting in my
opinion) is to simulate how the survival rates will be affected by
modifying the dosage of the different treatments. Here you should look
at the manual entry -stcox postestimation- and the example I give
below.

Let me show you the following example (keep in mind, however, that I'm
working with Stata 11.2):

Suppose lung cancer patients are exposed to saturated fats (let's call
this variable "mean_reduct") and smoking (let's call this variable
"max_red") during the sample period. Suppose further that you were
able to measure the amount each patient was exposed to. The output of
the -stcox- command is the following:

. stcox mean_reduct max_red reductcurrent_max_reduct rent_net8
> tgage appreciation,nohr

         failure _d:  fail == 1
   analysis time _t:  time_index
                 id:  appt

Iteration 0:   log likelihood = -78368.249
Iteration 1:   log likelihood = -74694.532
Iteration 2:   log likelihood = -74538.881
Iteration 3:   log likelihood = -74533.372
Iteration 4:   log likelihood = -74533.352
Iteration 5:   log likelihood = -74533.352
Refining estimates:
Iteration 0:   log likelihood = -74533.352

Cox regression -- Breslow method for ties

No. of subjects =         9547                     Number of obs   =    499393
No. of failures =         9547
Time at risk    =       547035
                                                   LR chi2(8)      =   7669.79
Log likelihood  =   -74533.352                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 mean_reduct |   .0129407   .0006221    20.80   0.000     .0117214    .0141599
     max_red |   .1713769    .020039     8.55   0.000     .1321011    .2106527
red~x_reduct |   .0223149   .0005149    43.34   0.000     .0213057     .023324
   rent_net8 |   .0025795   .0001659    15.55   0.000     .0022543    .0029047
diff_stdma~a |  -.4692028   .0457971   -10.25   0.000    -.5589636   -.3794421
permanent~82 |  -.0004599   .0000689    -6.67   0.000    -.0005949   -.0003248
diff_mortg~e |  -7.166604   .9474631    -7.56   0.000    -9.023597   -5.309611
appreciation |   9.514355   3.162537     3.01   0.003     3.315895    15.71281
------------------------------------------------------------------------------

What is important here is that the coefficients of "mean_reduct"
(0.0129407) and "max_red" (0.1713769) are positive and highly
significant. They imply that increasing the dosage of either harmful
substance raises the hazard of death, and that for the same one-unit
increase, smoking is riskier to lung-cancer patients than saturated
fats.
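
To put these coefficients on the hazard-ratio scale, exponentiate
them (with the nohr option, -stcox- reports coefficients rather than
hazard ratios):

display exp(.0129407)
display exp(.1713769)

These come to roughly 1.013 and 1.187, i.e. about a 1.3% and a 19%
higher hazard per one-unit increase, holding the other variables
fixed.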

Next, based on this model, we would like to predict the survival
rates for a dosage of mean_reduct=20 and max_red=10, with the other
variables at their sample means, for time_index=100. We can run the
following command:

. margins if time_index==100, at(mean_reduct=20 max_red=10) atmeans
> predict(nohr)

Adjusted predictions                              Number of obs   =       1262
Model VCE    : OIM

Expression   : Relative hazard, predict(nohr)
at           : mean_reduct     =          20
               max_red         =          10
               red~x_reduct    =   -32.56141 (mean)
               rent_net8       =    70.61078 (mean)
               diff_stdma~a    =   -.4000001 (mean)
               permanent~82    =    1116.988 (mean)
               diff_mortg~e    =   -.0280243 (mean)
               appreciation    =           0 (mean)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   3.680264   .8105245     4.54   0.000     2.091666    5.268863
------------------------------------------------------------------------------

Unfortunately, the -margins- command does not give me the survival
rates, even if I specify predict(basesurv).

I can run the following commands, which construct a vector of survival
rates, one for each sample period:

. predict full, basesurv
(8405 missing values generated)
. collapse (mean) full if fail==1, by(time_index)

The problem is that if we run these commands on the original model,
the projected survival rates will be valid only for zero values of all
the variables, because basesurv is the baseline survivor function.
Therefore, we need to redefine the variables of the model so that from
each variable we subtract the value at which we would like to predict.
For example:

gen mean_reduct1=mean_reduct-20
gen max_red1=max_red-10

etc. Then we run the model again on the new variables and construct
the projected survival rates.
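
With the recentered variables in place, the remaining steps can be
sketched as follows. Only the two dosage variables are shown; the
other covariates would be recentered and included in the same way,
and the name "surv_at" is only illustrative:

* refit on the recentered variables, so baseline = the subtracted values
stcox mean_reduct1 max_red1, nohr
* baseline survivor function of the refitted model
predict surv_at, basesurv
* one survival rate per sample period, as before
collapse (mean) surv_at if fail==1, by(time_index)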

On Wed, Oct 26, 2011 at 11:54 AM, Änne Glass <aenne.glass@uni-rostock.de> wrote:
> Hello Statalist,
>
> we are interested in doing a landmark survival analysis with Stata(10.1),
> comparing therapy1 vs therapy2, before and after a landmark (t=24 months).
> Our data table consists of id, survivalTime, survivalStatus, therapyGroup.
>
> http://www.stata.com/statalist/archive/2011-02/msg00207.html we did
> 1)  -stset- for declaring data to be survival-time data with _st, _d, __t,
> _t0. (done)
> 2)  -stsplit- to split data into 2 time-span records with the landmark t=24,
> that means one pre-landmark and one post-landmark. This step modified our
> data table by adding one line for those ids going over 24 months. (done)
> 3)  -sts graph- command plotted the Kaplan-Meier failure function for the
> whole time-span (0-102 months) for both treatments. (done, but not desired)
>
> A step-by-step approach would be great, as we are not yet that proficient in
> Stata.
> Many thanks in advance - Aenne.

--
Dr. Yuval Arbel