# RE: st: Stcurve following stcox, strata

 From May Boggess To statalist@hsphsun2.harvard.edu Subject RE: st: Stcurve following stcox, strata Date 07 Apr 2004 10:36:34 -0500

```On Tuesday, Trey asked how to plot survival curves after a stratified
Cox regression. The command -stcurve- can be used for plotting
survival and hazard curves after -stcox- and -streg- in most
circumstances.

But if we have a more complicated model than usual, say we have
continuously time-varying covariates, or, as in Trey's case, the model
is stratified, we need to generate the survival function and plot it
using -twoway- as follows:

clear
sysuse cancer
drop if drug==3
qui stset st,failure(di)
sum age
gen meanage=age-r(mean)

stcox meanage, basesurv(baseline) strata(drug) nolog
gen s1=baseline if drug==1
label var s1 "drug=1"
gen s2=baseline if drug==2
label var s2 "drug=2"
sort _t
twoway line s1 _t, connect(stairstep) || /*
*/  line s2 _t, connect(stairstep) ||, /*
*/ legend(rows(1)) title("Survival at average age")/*
*/ xtitle("analysis time")

In this example I am plotting the survival curves for each strata.

Trey also has a binary covariate and wants to plot the survival curves,
for each strata, at each covariate value, as follows:

clear
sysuse cancer
drop if drug==3
set seed 12345
gen x1 =uniform()>0.5
stset studytime,failure(died)
sort _t
sum age
gen meanage=age-r(mean)

stcox x1 meanage, strata(drug) basesurv(baseline) nolog
gen s1=baseline^exp(_b[x1]*x1) if drug==1
label var s1 "drug=1"
gen s2=baseline^exp(_b[x1]*x1) if drug==2
label var s2 "drug=2"

twoway line s1 _t, connect(stairstep) || /*
*/ line s2 _t, connect(stairstep)||, by(x1)

There are  number of different ways to set out the plot, but the
important point to note is that the survival function is different for
each strata, so if you just try to plot it for the two values of x1, you
don't get the plot you are looking for:

clear
sysuse cancer
drop if drug==3
set seed 12345
gen x1 =uniform()>0.5
stset studytime,failure(died)
sort _t
sum age
gen meanage=age-r(mean)
stcox x1 meanage, strata(drug) basesurv(baseline) nolog
gen s=baseline^exp(_b[x1]*x1)
twoway line s _t, by(x1)

If you want to put all four survival functions on one plot, you just
break the baseline up into four bits:

clear
sysuse cancer
drop if drug==3
set seed 12345
gen x1 =uniform()>0.5
stset studytime,failure(died)
sort _t
sum age
gen meanage=age-r(mean)
stcox x1 meanage, strata(drug) basesurv(baseline) nolog
gen s11=baseline^exp(_b[x1]) if drug==1 & x1==1
label var s11 "drug=1, x1=1"
gen s21=baseline^exp(_b[x1]) if drug==2  & x1==1
label var s21 "drug=2, x1=1"
gen s10=baseline if drug==1 & x1==0
label var s10 "drug=1, x1=0"
gen s20=baseline if drug==2  & x1==0
label var s20 "drug=2, x1=0"
twoway line  s11 _t, connect(stairstep) || /*
*/ line s21 _t, connect(stairstep)||/*
*/ line s10 _t, connect(stairstep)||/*
*/ line s20 _t, connect(stairstep)

Lastly, you may be wondering why I created the variable meanage, and
used that in the -stcox-, rather than just using age. The baseline
survival function is the the survival evaluated with all the covariates
set to zero. If the covariate equal to zero is a sensible value for your
dataset, then you expect sensible values. But in my example, age=0 was
not anywhere near the values in my dataset. The range of age was from 47
to 67. By shifting my age variable, I am protecting myself against
numerical problems that sometimes occur when we calculate the baseline
at a value way outside the range of the data. To see how I would
do the above without this shift:

clear
sysuse cancer
drop if drug==3
set seed 12345
gen x1 =uniform()>0.5
stset studytime,failure(died)
sort _t
gen copy=age

stcox x1 age, strata(drug) basesurv(baseline) nolog
sum age
replace age=r(mean)
predict xb,xb
gen t11=baseline^exp(xb) if drug==1 & x1==1
gen t21=baseline^exp(xb) if drug==2  & x1==1
gen t10=baseline^exp(xb) if drug==1 & x1==0
gen t20=baseline^exp(xb) if drug==2  & x1==0

drop baseline xb
replace age=copy
sum age
gen meanage=age-r(mean)
stcox x1 meanage, strata(drug) basesurv(baseline) nolog
gen s11=baseline^exp(_b[x1]) if drug==1 & x1==1
gen s21=baseline^exp(_b[x1]) if drug==2  & x1==1
gen s10=baseline if drug==1 & x1==0
gen s20=baseline if drug==2  & x1==0

sum t11 s11
sum t21 s21
sum t10 s10
sum t20 s20

-- May
mmb@stata.com

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```