In the spotlight: Easy-to-interpret, flexible survival-time treatment effects

Introduction

Is smoking bad for men who have already had a heart attack? You may already have an idea about the answer. But if you are a researcher studying smoking and heart disease, you know that this question is too general. A question with a statistical answer is “By how much will smoking reduce the time to a second heart attack among men aged 45–55 who have already had a heart attack?” And, by the way, you clearly cannot require men to smoke.

This latter question highlights (1) that we must be precise when estimating effects, (2) that the survival analysis of nonnegative, right-censored data is essential for quantifying the effect of this treatment, and (3) that we frequently must use observational data for ethical reasons.

We use observational data out of necessity, but the randomized experiment defines the treatment effect. For our example, the treatment effect is a population measure that compares what happens when everyone smokes with what happens when no one smokes. Because each individual either smokes or does not smoke, we see only one of the two potential outcomes. When the treatment is randomized, the missing potential outcome is missing completely at random. With observational data, we assume that the missing potential outcome is missing at random after conditioning on covariates.

In this spotlight, I reiterate the well-known point that the effect estimable from the Cox model parameters is difficult to interpret for nontechnical audiences. I illustrate why the Cox model is less flexible for estimating treatment effects than many researchers believe. Finally, I show how to use the new stteffects command to flexibly estimate an easy-to-interpret treatment effect.

What Cox can and cannot tell us

We have simulated data on the time to second heart attack (atime), a binary failure indicator (fail), a binary indicator for whether the man smokes (smoke), the man’s age in decades (age), an exercise index (exercise), and a diet index (diet). These data have already been stset with atime as the time to event and fail as the failure indicator.
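For readers who want to see what "already stset" means in practice, the declaration would look roughly like the sketch below. It assumes only the variable names given above (atime and fail); the dataset itself is not distributed with this text, so treat it as an illustration rather than a reproduction.

```stata
* Sketch only: declare atime as the analysis time and fail as the failure indicator
stset atime, failure(fail)

* Optional: summarize the declared survival-time data
stdescribe
```

After stset, Stata stores the analysis time in _t, which is why the output tables below are labeled _t.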

Many researchers would begin a survival analysis by estimating the parameters of a Cox model. The hazard function is the instantaneous rate at which the event occurs, given that it has not yet happened; loosely, it is the probability that the event will occur in the next moment, given that it has not yet occurred. The Cox model parameterizes the effect of covariates on the hazard function as a multiplicative factor.

Below, we estimate the parameters of a Cox model for our data.

. stcox smoke age exercise diet
Cox regression -- no ties

No. of subjects =        5,000                  Number of obs    =       5,000
No. of failures =        2,969
Time at risk    =  10972.84266
                                                LR chi2(4)       =      271.77
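Log likelihood  =   -21963.163                  Prob > chi2      =      0.0000

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       smoke |   1.540071   .0764791     8.70   0.000     1.397239    1.697505
         age |   2.024237   .1946491     7.33   0.000     1.676527    2.444062
    exercise |   .5473001   .0454893    -7.25   0.000      .465026    .6441304
        diet |   .4590354   .0379597    -9.42   0.000     .3903521    .5398037
------------------------------------------------------------------------------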

The output indicates that smoking multiplies the hazard of a second heart attack by about 1.5 (the estimated hazard ratio is 1.54). Because smoke is not interacted with the other covariates, this hazard ratio is constant across covariate patterns.
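The link between the reported hazard ratio and the underlying model coefficient can be checked directly. The sketch below assumes the same data are in memory and already stset; it refits the model with coefficients displayed and then exponentiates the coefficient on smoke.

```stata
* Refit the model, reporting coefficients (log hazard ratios) instead of hazard ratios
stcox smoke age exercise diet, nohr

* The hazard ratio for smoke is the exponentiated coefficient
display exp(_b[smoke])
```

The displayed value should match the Haz. Ratio entry for smoke in the table above.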

Though researchers become accustomed to hazard ratios, they are notoriously difficult to explain to patients, policy makers, and other nontechnical people. I, as a technical person, can back out how bad a factor of 1.5 might be, but I would prefer an effect measured in the time to failure.
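To make "backing out" concrete, here is a sketch under an assumption that is not part of the Cox model: suppose that, for a given covariate pattern, survival times follow a Weibull proportional-hazards model with scale \lambda and shape p (both symbols are introduced here for illustration only).

```latex
S(t) = \exp(-\lambda t^{p}), \qquad
h(t) = p\,\lambda\,t^{p-1}, \qquad
E[T] = \lambda^{-1/p}\,\Gamma\!\left(1 + \tfrac{1}{p}\right)

% Multiplying the hazard by 1.5 replaces \lambda with 1.5\lambda, so
\frac{E[T \mid \text{hazard} \times 1.5]}{E[T]} = 1.5^{-1/p}

% p = 1 (exponential): 1.5^{-1}   \approx 0.67
% p = 2:               1.5^{-1/2} \approx 0.82
```

Under this assumption, the same hazard ratio of 1.5 shortens the mean time to failure by about a third when p = 1 but by less than a fifth when p = 2, which is why the hazard ratio alone does not tell a patient how much time is lost.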

When it is constant, the hazard ratio from these observational data has the same interpretation as the hazard ratio we would obtain if the treatment smoke were randomized, given that the missing potential outcome is missing at random conditional on the covariates. When the treatment smoke is interacted with the other covariates, the hazard ratio varies over covariate patterns, and the Cox model parameters cannot be used by themselves to recover the hazard ratio that we would obtain if smoking were randomized.

Below, we estimate the parameters of a Cox model in which smoke is interacted with the other covariates.

. stcox ibn.smoke#c.(age exercise diet)
Cox regression -- no ties

No. of subjects =        5,000                  Number of obs    =       5,000
No. of failures =        2,969
Time at risk    =  10972.84266
                                                LR chi2(6)       =      223.11
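Log likelihood  =   -21987.493                  Prob > chi2      =      0.0000

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 smoke#c.age |
  Nonsmoker  |   1.714749   .1751413     5.28   0.000     1.403655    2.094791
     Smoker  |   3.979649   1.110035     4.95   0.000     2.303673    6.874936
             |
       smoke#|
  c.exercise |
  Nonsmoker  |   .5514891   .0476827    -6.88   0.000     .4655224    .6533309
     Smoker  |   .2839313   .0822003    -4.35   0.000     .1609844    .5007752
             |
smoke#c.diet |
  Nonsmoker  |   .4461597   .0389598    -9.24   0.000     .3759769    .5294433
     Smoker  |   .6908017   .1785842    -1.43   0.152      .416201    1.146578
------------------------------------------------------------------------------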

The estimated effects of age, exercise, and diet differ between smokers and nonsmokers; equivalently, the effect of smoking varies with these covariates. The formal tests reported below support this conclusion.

. contrast smoke#c.age smoke#c.exercise smoke#c.diet, overall
Contrasts of marginal linear predictions

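Margins      : asbalanced

----------------------------------------------------
                 |         df        chi2     P>chi2
-----------------+----------------------------------
     smoke#c.age |          1        8.05     0.0045
                 |
smoke#c.exercise |          1        4.83     0.0280
                 |
    smoke#c.diet |          1        2.58     0.1083
                 |
         Overall |          3       21.91     0.0001
----------------------------------------------------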

The overall test rejects the null hypothesis that the effect of smoking does not vary with age, exercise, and diet. We cannot obtain the randomized hazard-ratio effect from these Cox model parameters because the hazard ratio is not constant.

Easy to interpret and easy to estimate

So instead of estimating the difficult-to-interpret hazard ratio, we estimate the average difference between the time to second heart attack if everyone smoked and the time to second heart attack if no one smoked. This effect is known as the average treatment effect (ATE), and it is defined as

ATE = E[t_{smoke} − t_{not_smoke}]

where t_{smoke} is the time to second heart attack when a person smokes and t_{not_smoke} is the time to second heart attack when that person does not smoke. The expected value computes the mean of these individual-level differences in the population.

We can use a model for the outcome, a model for the treatment, or both models to account for the missing potential outcome, because either t_smoke or t_{not_smoke} is missing at random, conditional on covariates.
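In symbols, and setting censoring aside for the moment, the two modeling routes can be sketched as follows. The notation is introduced here for illustration and is not taken from the stteffects documentation: t is the observed time, s is the smoking indicator, x is the covariate vector, and p(x) = Pr(s = 1 | x) is the probability of smoking given the covariates, assumed to lie strictly between 0 and 1.

```latex
% Outcome-model route (regression adjustment): average the difference
% between the two conditional means over the covariate distribution
\text{ATE} = E\big[\, E(t \mid x, s = 1) - E(t \mid x, s = 0) \,\big]

% Treatment-model route (inverse-probability weighting): reweight the
% observed outcomes by the inverse probability of the treatment received
\text{ATE} = E\!\left[ \frac{s\,t}{p(x)} - \frac{(1 - s)\,t}{1 - p(x)} \right]
```

Both expressions rely on the missing-at-random assumption stated above. With right-censored data, each route additionally needs a way to account for the censored times, which is what the estimators described next provide.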

The regression-adjustment (RA) estimator uses a model for the outcome to account for the missing potential outcome. The data lost to censoring are handled in the log-likelihood function. Below, we use stteffects ra to estimate the ATE.

. stteffects ra (age exercise diet) (smoke) 
Survival treatment-effects estimation           Number of obs     =      5,000
Estimator      : regression adjustment
Outcome model  : Weibull
Treatment model: none
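Censoring model: none
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ATE          |
       smoke |
    (Smoker  |
         vs  |
 Nonsmoker)  |  -1.520671   .2011014    -7.56   0.000    -1.914822   -1.126519
-------------+----------------------------------------------------------------
POmean       |
       smoke |
  Nonsmoker  |   4.057439   .1028462    39.45   0.000     3.855864    4.259014
------------------------------------------------------------------------------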

On average, the second heart attack would occur about 1.5 years earlier if everyone smoked than the 4.1 years that would be observed if no one smoked.
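If the two potential-outcome means are of more interest than their difference, they can be requested directly. The sketch below assumes the same stset data are in memory; the final line is simple arithmetic on the estimates reported above, not new output.

```stata
* Report both potential-outcome means instead of the ATE
stteffects ra (age exercise diet) (smoke), pomeans

* Arithmetic check with the reported estimates:
* nonsmoker mean plus ATE gives the implied smoker mean (about 2.54)
display 4.057439 + (-1.520671)
```

The implied mean time to a second heart attack if everyone smoked is therefore about 2.5 years, compared with about 4.1 years if no one smoked.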

These estimates are much easier to interpret than the hazard ratio, and the underlying model implicitly interacts the treatment with all the covariates. Instead of the default Weibull distribution, we could have used the gamma distribution or the lognormal distribution to model the outcome. stteffects ra also allows us to model the second (ancillary) parameter of each of these distributions as a function of covariates, which produces flexible outcome models.
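The alternative outcome models could be requested along the following lines. This is a sketch that assumes the same stset data are in memory; the choice of age as the covariate for the ancillary parameter is purely illustrative and is not a recommendation from the article.

```stata
* Gamma outcome model instead of the default Weibull
stteffects ra (age exercise diet, gamma) (smoke)

* Lognormal outcome model
stteffects ra (age exercise diet, lnormal) (smoke)

* Weibull outcome model whose ancillary (shape) parameter depends on age
stteffects ra (age exercise diet, weibull ancillary(age)) (smoke)
```

Comparing the estimated ATE across these specifications is one informal way to see how sensitive the conclusion is to the assumed outcome distribution.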

The inverse-probability-weighted (IPW) estimator uses models for the treatment assignment and the censoring process to account for the missing data. Below, we use stteffects ipw to estimate the ATE.

. stteffects ipw (smoke age exercise diet) (age exercise diet)
Survival treatment-effects estimation           Number of obs     =      5,000
Estimator      : inverse-probability weights
Outcome model  : weighted mean
Treatment model: logit
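Censoring model: Weibull
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ATE          |
       smoke |
    (Smoker  |
         vs  |
 Nonsmoker)  |  -1.689397   .3373219    -5.01   0.000    -2.350536   -1.028258
-------------+----------------------------------------------------------------
POmean       |
       smoke |
  Nonsmoker  |   4.200135   .2156737    19.47   0.000     3.777423    4.622848
------------------------------------------------------------------------------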

On average, the second heart attack would occur about 1.7 years earlier if everyone smoked than the 4.2 years that would be observed if no one smoked. As with the RA estimator, other distributions and flexible parameterizations are available for the treatment model and the censoring-process model.
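As a sketch of those alternatives, again assuming the same stset data are in memory, the treatment model and the censoring model can each be changed through options inside their own parentheses. The specific combinations below are illustrative only.

```stata
* Probit treatment model and gamma censoring model
stteffects ipw (smoke age exercise diet, probit) (age exercise diet, gamma)

* Default logit treatment model; Weibull censoring model whose
* ancillary parameter depends on age (illustrative choice)
stteffects ipw (smoke age exercise diet) (age exercise diet, weibull ancillary(age))
```

In both commands, the first set of parentheses holds the treatment variable followed by the treatment-model covariates, and the second holds the censoring-model covariates.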

The inverse-probability-weighted regression-adjustment (IPWRA) estimator implemented in stteffects ipwra uses models for the outcome and the treatment, and optionally the censoring process, to increase efficiency. It offers the flexibility of both the RA and the IPW estimators.
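A call to this estimator might look like the sketch below, which assumes the same stset data are in memory and reuses the covariates from the earlier examples for every model. That reuse is a simplification made here, not a requirement of the command.

```stata
* Outcome model, treatment model, and censoring model, in that order
stteffects ipwra (age exercise diet) (smoke age exercise diet) (age exercise diet)

* Same estimator without a separate censoring model
stteffects ipwra (age exercise diet) (smoke age exercise diet)
```

The first set of parentheses specifies the outcome-model covariates, the second the treatment variable and treatment-model covariates, and the optional third the censoring-model covariates.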

Conclusion and further reading

With survival data, many researchers default to a Cox model, even though the effect estimable from the Cox model parameters is difficult to interpret for nontechnical audiences. The Cox model is less flexible for estimating treatment effects than many researchers believe. The new stteffects command can flexibly estimate an easy-to-interpret treatment effect. For more information, see [TE] stteffects.

—David M. Drukker
Director of Econometrics