
st: A Question Regarding the Cox Regression and Projected Survival Rates

From: Yuval Arbel
To: statalist@hsphsun2.harvard.edu
Subject: st: A Question Regarding the Cox Regression and Projected Survival Rates
Date: Thu, 20 Oct 2011 17:11:39 +0200

"mean_reduct" and "max_red" in the output appended below (immediately
after the question)

Note that the coefficient of "mean_reduct" (3.53x(10^(-2))) is roughly
5 times smaller than the coefficient of "max_red" (19.26x(10^(-2))).
Both coefficients are highly significant.

As I understand the Stata manual, we should therefore anticipate a
much bigger increase in the hazard, all else being equal, for a 1-unit
increase in "max_red" than for a 1-unit increase in "mean_reduct".
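
To put numbers on this reading, coefficients reported with "nohr" can
be exponentiated to obtain hazard ratios. The lines below are only a
minimal sketch: the two values are copied from the appended output,
they come from two separate models, and the scalar names b_max and
b_mean are made up for the illustration.

```stata
* Coefficients copied from the two "stcox ..., nohr" tables in the appended output
scalar b_max  = .1925865    // coefficient of max_red
scalar b_mean = .0353358    // coefficient of mean_reduct

* Hazard ratio for a 1-unit increase in each covariate
display "HR, max_red     = " exp(scalar(b_max))
display "HR, mean_reduct = " exp(scalar(b_mean))

* Ratio of the two coefficients (the "roughly 5 times" comparison)
display "coefficient ratio = " scalar(b_max)/scalar(b_mean)
```

These work out to hazard ratios of about 1.21 and 1.04, i.e. roughly a
21% versus a 3.6% increase in the hazard per unit.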

Yet when I try to translate these outcomes into projected survival
rates, they seem inconsistent with the above interpretation of the
coefficients. As you can see from the output below, I used the
"basesurv" option to construct two vectors of projected survival rates
evaluated at the sample mean (and at max_red=mean_reduct=10). The
names of these vectors are "max_omit" and "self_omit", which
correspond to "mean_reduct" and "max_red", respectively, in the
regression model. I made sure that these vectors are obtained from Cox
regressions with the same coefficients. Finally, I collapsed the
survival rates to their mean in each of 103 sample periods (there are
in fact 114 sample periods, but failures start from period 12).
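
For reference, the relation between a baseline survivor function and
the survival rate at given covariate values can be sketched as
follows. This is an illustration only, run on the "cancer" example
dataset shipped with Stata rather than on the data analysed here, and
the variable names s0, xb and s_x are made up for the example. It
assumes the usual Cox relation S(t|x) = S0(t)^exp(xb), where S0(t) is
the survivor function with all covariates equal to zero.

```stata
* Example data shipped with Stata (not the data used in this post)
sysuse cancer, clear
stset studytime, failure(died)
stcox age drug, nohr

* Baseline survivor function: all covariates set to zero
predict double s0, basesurv

* Linear prediction xb at each observation's own covariate values
predict double xb, xb

* Survivor function at those covariate values: S(t|x) = S0(t)^exp(xb)
generate double s_x = s0^exp(xb)

summarize s0 s_x
```

Under this reading, each "basesurv" vector is tied to the zero point
of the covariates in the model it was predicted from, which is why the
covariates are recentred before estimation in the output below.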

Note that the mean survival rate across all sample periods is
78.28x(10^(-2)) for "max_omit" and 99.51x(10^(-2)) for "self_omit".
This is precisely the opposite of what I would anticipate from the Cox
regression: the average projected survival rate of "self_omit" across
all periods should be smaller than 78%. Moreover, in the last period
of the sample, the projected survival rate of "max_omit" is 0% and
that of "self_omit" is 49.99%! Again, this stands in contrast to the
Cox regression outcomes.

My question is: what am I missing here? How can I explain this
apparent inconsistency?

Yours sincerely,

Yuval

. do "G:\public housing\increasing_experiment_average.do"

. clear

. clear matrix

. set memory 500m
(512000k)

. set matsize 800

. use "g:\public housing\test_sample_May_07_Bought.dta", clear

. stcox mean_reduct reductcurrent_mean_reduct rent_net8
> ppreciation,nohr

         failure _d:  fail == 1
   analysis time _t:  time_index
                 id:  appt

Iteration 0:   log likelihood = -78368.249
Iteration 1:   log likelihood = -74721.874
Iteration 2:   log likelihood = -74566.501
Iteration 3:   log likelihood = -74561.567
Iteration 4:   log likelihood = -74561.555
Refining estimates:
Iteration 0:   log likelihood = -74561.555

Cox regression -- Breslow method for ties

No. of subjects =         9547                     Number of obs   =    499393
No. of failures =         9547
Time at risk    =       547035
                                                   LR chi2(7)      =   7613.39
Log likelihood  =   -74561.555                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 mean_reduct |   .0353358   .0005278    66.94   0.000     .0343012    .0363703
reductcurr~t |   .0221957   .0005134    43.23   0.000     .0211894    .0232019
   rent_net8 |   .0025506   .0001655    15.41   0.000     .0022263    .0028749
diff_stdma~a |  -.4642809   .0446886   -10.39   0.000    -.5518688   -.3766929
permanent~82 |  -.0004675   .0000689    -6.79   0.000    -.0006025   -.0003325
diff_mortg~e |  -6.430141   .8913818    -7.21   0.000    -8.177217   -4.683064
appreciation |   9.629971   3.161657     3.05   0.002     3.433237     15.8267
------------------------------------------------------------------------------

.
. gen max_red=0

. replace max_red=75 if time_index>=0 & time_index<=14

. replace max_red=95 if time_index>=15 & time_index<=93

. replace max_red=90 if time_index>=94 & time_index<=95

. replace max_red=92 if time_index>=96 & time_index<=114

.
. gen  reductcurrent_max_reduct=reduct_per-max_red

permanentincomeestimate82 diff_mortgage apprec
> iation,nohr

         failure _d:  fail == 1
   analysis time _t:  time_index
                 id:  appt

Iteration 0:   log likelihood = -78368.249
Iteration 1:   log likelihood = -74893.557
Iteration 2:   log likelihood = -74745.467
Iteration 3:   log likelihood = -74741.532
Iteration 4:   log likelihood = -74741.525
Refining estimates:
Iteration 0:   log likelihood = -74741.525

Cox regression -- Breslow method for ties

No. of subjects =         9547                     Number of obs   =    499393
No. of failures =         9547
Time at risk    =       547035
                                                   LR chi2(7)      =   7253.45
Log likelihood  =   -74741.525                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     max_red |   .1925865   .0203524     9.46   0.000     .1526964    .2324765
red~x_reduct |   .0288315   .0004167    69.19   0.000     .0280148    .0296482
   rent_net8 |   .0027994   .0001633    17.15   0.000     .0024794    .0031194
diff_stdma~a |   -.482652   .0435363   -11.09   0.000    -.5679816   -.3973225
permanent~82 |  -.0003909   .0000691    -5.65   0.000    -.0005264   -.0002554
diff_mortg~e |  -6.445779   .8805713    -7.32   0.000    -8.171667   -4.719891
appreciation |   5.424572   3.212262     1.69   0.091    -.8713461    11.72049
------------------------------------------------------------------------------

permanentincomeestimate211 appreciation10011

.
.
. gen rent_net11=rent_net8-60.45422

(1 missing value generated)

.
. gen permanentincomeestimate211=permanentincomeestimate82-1107.764

.
. gen diff_mortgage11=diff_mortgage+.000497

. gen appreciation10011=appreciation-.0016098

.
. gen max_reduct_actual1=max_red-10

. gen reductcurrent_max_actual1=reductcurrent_max_reduct

. stcox max_reduct_actual1 reductcurrent_max_actual1
> estimate211 appreciation10011,nohr

         failure _d:  fail == 1
   analysis time _t:  time_index
                 id:  appt

Iteration 0:   log likelihood = -78368.249
Iteration 1:   log likelihood = -74893.557
Iteration 2:   log likelihood = -74745.467
Iteration 3:   log likelihood = -74741.532
Iteration 4:   log likelihood = -74741.525
Refining estimates:
Iteration 0:   log likelihood = -74741.525

Cox regression -- Breslow method for ties

No. of subjects =         9547                     Number of obs   =    499393
No. of failures =         9547
Time at risk    =       547035
                                                   LR chi2(7)      =   7253.45
Log likelihood  =   -74741.525                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
max_reduct~1 |   .1925865   .0203524     9.46   0.000     .1526964    .2324765
reductcur~l1 |   .0288315   .0004167    69.19   0.000     .0280148    .0296482
diff_stdm~11 |   -.482652   .0435363   -11.09   0.000    -.5679815   -.3973225
  rent_net11 |   .0027994   .0001633    17.15   0.000     .0024794    .0031194
diff_mort~11 |  -6.445778   .8805713    -7.32   0.000    -8.171666    -4.71989
permanen~211 |  -.0003909   .0000691    -5.65   0.000    -.0005264   -.0002554
apprec~10011 |   5.424572   3.212262     1.69   0.091    -.8713462    11.72049
------------------------------------------------------------------------------

. predict self_omit,basesurv
(8405 missing values generated)

.
. gen  mean_reduct_actual1=mean_reduct-10

. drop   reductcurrent_mean_reduct1

. gen reductcurrent_mean_reduct1=reductcurrent_mean_reduct

. stcox mean_reduct_actual1 reductcurrent_mean_reduct1
> meestimate211 appreciation10011,nohr

         failure _d:  fail == 1
   analysis time _t:  time_index
                 id:  appt

Iteration 0:   log likelihood = -78368.249
Iteration 1:   log likelihood = -74721.874
Iteration 2:   log likelihood = -74566.501
Iteration 3:   log likelihood = -74561.567
Iteration 4:   log likelihood = -74561.555
Refining estimates:
Iteration 0:   log likelihood = -74561.555

Cox regression -- Breslow method for ties

No. of subjects =         9547                     Number of obs   =    499393
No. of failures =         9547
Time at risk    =       547035
                                                   LR chi2(7)      =   7613.39
Log likelihood  =   -74561.555                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
mean_redu~l1 |   .0353358   .0005278    66.94   0.000     .0343012    .0363703
reductcur~t1 |   .0221957   .0005134    43.23   0.000     .0211894    .0232019
diff_stdm~11 |  -.4642809   .0446886   -10.39   0.000    -.5518688   -.3766929
  rent_net11 |   .0025506   .0001655    15.41   0.000     .0022263    .0028749
diff_mort~11 |   -6.43014   .8913818    -7.21   0.000    -8.177217   -4.683064
permanen~211 |  -.0004675   .0000689    -6.79   0.000    -.0006025   -.0003325
apprec~10011 |   9.629971   3.161657     3.05   0.002     3.433236     15.8267
------------------------------------------------------------------------------

. predict max_omit,basesurv
(8405 missing values generated)

.
. gen diff=reductcurrent_mean_reduct-max_reduct

. stcox mean_reduct_actual1 max_reduct_actual1 diff
> stimate211 appreciation10011,nohr

         failure _d:  fail == 1
   analysis time _t:  time_index
                 id:  appt

Iteration 0:   log likelihood = -78368.249
Iteration 1:   log likelihood = -74694.532
Iteration 2:   log likelihood = -74538.881
Iteration 3:   log likelihood = -74533.372
Iteration 4:   log likelihood = -74533.352
Iteration 5:   log likelihood = -74533.352
Refining estimates:
Iteration 0:   log likelihood = -74533.352

Cox regression -- Breslow method for ties

No. of subjects =         9547                     Number of obs   =    499393
No. of failures =         9547
Time at risk    =       547035
                                                   LR chi2(8)      =   7669.79
Log likelihood  =   -74533.352                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
mean_redu~l1 |   .0352556    .000528    66.77   0.000     .0342207    .0362904
max_reduct~1 |   .1713769    .020039     8.55   0.000     .1321011    .2106527
        diff |   .0223149   .0005149    43.34   0.000     .0213057     .023324
diff_stdm~11 |  -.4692028   .0457971   -10.25   0.000    -.5589636   -.3794421
  rent_net11 |   .0025795   .0001659    15.55   0.000     .0022543    .0029047
diff_mort~11 |  -7.166604    .947463    -7.56   0.000    -9.023597    -5.30961
permanen~211 |  -.0004599   .0000689    -6.67   0.000    -.0005949   -.0003248
apprec~10011 |   9.514355   3.162538     3.01   0.003     3.315895    15.71281
------------------------------------------------------------------------------

. predict full,basesurv
(8405 missing values generated)

.
. collapse (mean) full self_omit max_omit if fail==1, by(time_index)

. gen t=_n

.
. ttest  max_omit=0

One-sample t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
max_omit |     103    .7828852    .0170734    .1732759    .7490202    .8167502
------------------------------------------------------------------------------
mean = mean(max_omit)                                         t =  45.8541
Ho: mean = 0                                     degrees of freedom =      102

Ha: mean < 0                 Ha: mean != 0                 Ha: mean > 0
Pr(T < t) = 1.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 0.0000

. ttest  self_omit=0

One-sample t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
self_o~t |     103    .9951453    .0048544    .0492665    .9855167    1.004774
------------------------------------------------------------------------------
mean = mean(self_omit)                                        t = 204.9998
Ho: mean = 0                                     degrees of freedom =      102

Ha: mean < 0                 Ha: mean != 0                 Ha: mean > 0
Pr(T < t) = 1.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 0.0000

. ttest max_omit=self_omit

Paired t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
max_omit |     103    .7828852    .0170734    .1732759    .7490202    .8167502
self_o~t |     103    .9951453    .0048544    .0492665    .9855167    1.004774
---------+--------------------------------------------------------------------
    diff |     103   -.2122602    .0155096    .1574049   -.2430233    -.181497
------------------------------------------------------------------------------
mean(diff) = mean(max_omit - self_omit)                      t = -13.6858
Ho: mean(diff) = 0                              degrees of freedom =      102

Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

. ttest full=0

One-sample t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
    full |     103    .9951447    .0048544    .0492666    .9855161    1.004773
------------------------------------------------------------------------------
mean = mean(full)                                             t = 204.9992
Ho: mean = 0                                     degrees of freedom =      102

Ha: mean < 0                 Ha: mean != 0                 Ha: mean > 0
Pr(T < t) = 1.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 0.0000

. ttest full=self_omit

Paired t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
    full |     103    .9951447    .0048544    .0492666    .9855161    1.004773
self_o~t |     103    .9951453    .0048544    .0492665    .9855167    1.004774
---------+--------------------------------------------------------------------
    diff |     103   -6.44e-07    6.74e-08    6.84e-07   -7.78e-07   -5.10e-07
------------------------------------------------------------------------------
mean(diff) = mean(full - self_omit)                          t =  -9.5524
Ho: mean(diff) = 0                              degrees of freedom =      102

Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

.
end of do-file

--
Dr. Yuval Arbel