  # RE: st: Confidence limits on survival probabilities from stcox

 From May Boggess To statalist@hsphsun2.harvard.edu Subject RE: st: Confidence limits on survival probabilities from stcox Date 01 Feb 2005 10:30:35 -0600

```On Tuesday, Neville wrote:
>
> Thanks for Ricardo's idea, but there are two problems: one that the
> estimates obtained are not the same as those that stcurve plots (perhaps the
> fact that my covariates are categorical variables with more than 2 levels I
> may have used the adjustfor option in the wrong way - I put in the dummy
> variables, except the baseline ones - so I stand to be corrected on this
> one) and two, that one only gets a listing at the time points in the dataset
> and not at all the time points.
>
> ... I can't reproduce the values in the output file
> cox1.dta using the sts list command - I've tried using both the by and
> adjustfor commands. Assuming one can reproduce the values, how does one
> overcome point two above?

Let me answer Neville's original question before getting onto the points
raised here: there is no command for calculating confidence intervals
around the survival curve from a Cox proportional hazard model with
covariates. Therneau and Grambsch "Modeling Survival Data" describe some
methods for calculating standard errors (page 268) but are less than
enthusiastic about the value of some of the confidence intervals
obtained.

Neville's first question today is how to get the same graph from
-sts, adjust()- and -stcurve- following -stcox-. The thing to keep in
mind is that -sts- evaluates at all covariates set to zero, whereas
-stcurve- evaluates at the mean of the covariates. So to see them
produce the same answer, we need to subtract the mean from each
covariate so that the means are zero. Here is an example. First the Cox
model:

clear
sysuse cancer
stset st,f(di)
stcox age drug, bases(surv)
stcurve, outfile(results, replace) surv
use results, clear
list
duplicates drop
sort _t
twoway line surv1 _t, connect(stairstep)
save results, replace

We see that I get the same plot using -twoway- on the saved results
as we did from -stcurve-. I dropped duplicate observations since they
don't add anything to the graph. Now -sts-:

clear
sysuse cancer
stset st,f(di)
sum age
replace age=age-r(mean)
sum drug
replace drug=drug-r(mean)

We should see that the graph looks the same as the one we got above.
We can compare the actual values of the function by using -sts gen-:

keep surv2 _t
duplicates drop
count
sort _t
twoway line surv2 _t, connect(stairstep)

merge using results
list

Neville's second question is how to get the values of the survival
function for time points that are not in the original dataset. If the
survival function was calculated after estimation by -predict- (like it
is for -streg-) we know what we would do: after the estimation we would
put into the dataset the times we are interested in and then use
-predict-. But since with -stcox- the survival function is worked at at
the time of estimation, this option is not available to us.

However, since the survival function is a step function we can deduce
the value of the function at anytime we are interested in. In my example
above I had S(28)=.0481023  and S(33)=.0108086. So for any time t
between 28 and 33, S(t)=.0481023.

-- May
mboggess@stata.com

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```