## In the spotlight: Easy-to-interpret, flexible survival-time treatment effects

### Introduction

Is smoking bad for men who have already had a heart attack? You may already have an idea about the answer. But if you are a researcher studying smoking and heart disease, you know that this question is too general. One question with a statistical answer is “By how much will smoking reduce the time to a second heart attack among men aged 45–55 who have already had a heart attack?” And, by the way, you clearly cannot require men to smoke.

This latter question highlights (1) that we must be precise when estimating effects, (2) that the survival analysis of nonnegative, right-censored data is essential for quantifying the effect of this treatment, and (3) that we frequently must use observational data for ethical reasons.

We use observational data out of necessity, but the randomized experiment defines the treatment effect. For our example, the treatment effect is a population-level comparison of what happens when everyone smokes with what happens when no one smokes. Because each individual either smokes or does not smoke, we see only one of the two potential outcomes. When the treatment is randomized, the missing potential outcome is missing completely at random. With observational data, we assume that the missing potential outcome is missing at random after conditioning on covariates.

In this spotlight, I reiterate the well-known point that the effect
estimable from the Cox model parameters is difficult to interpret
for nontechnical audiences. I illustrate why the Cox model is less
flexible for estimating treatment effects than many researchers
believe. Finally, I show how to use the new **stteffects** command
to flexibly estimate an easy-to-interpret treatment effect.

### What Cox can and cannot tell us

We have simulated data on the time to second heart attack (**atime**), a
binary failure indicator (**fail**), a binary indicator for whether the
man smokes (**smoke**), the man's age in decades (**age**), an exercise
index (**exercise**), and a diet index (**diet**). These data
have already been **stset** with **atime** as the time to
event and **fail** as the failure indicator.
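
For readers who want to set up similar data, the declaration step might look like the sketch below. It assumes only the variable names described above. The simulated dataset is not distributed with this article, so treat the commands as illustrative.

```stata
* Declare atime as the analysis time and fail as the failure indicator
* (assumes nonzero values of fail mark an observed second heart attack)
stset atime, failure(fail)

* Describe the survival-time settings as a quick sanity check
stdescribe
```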

Many researchers would begin a survival analysis by estimating the parameters of a Cox model. The hazard function is the instantaneous rate at which the event occurs, given that it has not yet happened; loosely, it is the probability that the event will occur in the next moment. The Cox model parameterizes the effect of covariates on the hazard function as a multiplicative factor.
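
In symbols, this multiplicative structure can be sketched as follows. The notation is introduced here for illustration: $h_0(t)$ is the baseline hazard, $\mathbf{x}$ collects the covariates, and $\mathbf{z}$ collects the covariates other than **smoke**.

$$
h(t \mid \mathbf{x}) = h_0(t)\,\exp(\mathbf{x}\boldsymbol{\beta}),
\qquad
\text{HR}_{\text{smoke}} = \frac{h(t \mid \text{smoke}=1, \mathbf{z})}{h(t \mid \text{smoke}=0, \mathbf{z})} = \exp(\beta_{\text{smoke}})
$$

When **smoke** enters the model without interactions, this ratio depends on neither $t$ nor $\mathbf{z}$.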

Below, we estimate the parameters of a Cox model for our data.

    . stcox smoke age exercise diet

    Cox regression -- no ties

    No. of subjects =        5,000                  Number of obs    =       5,000
    No. of failures =        2,969
    Time at risk    =  10972.84266
                                                    LR chi2(4)       =      271.77
    Log likelihood  =   -21963.163                  Prob > chi2      =      0.0000

    ------------------------------------------------------------------------------
              _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           smoke |   1.540071   .0764791     8.70   0.000     1.397239    1.697505
             age |   2.024237   .1946491     7.33   0.000     1.676527    2.444062
        exercise |   .5473001   .0454893    -7.25   0.000      .465026    .6441304
            diet |   .4590354   .0379597    -9.42   0.000     .3903521    .5398037
    ------------------------------------------------------------------------------

The output indicates that smoking increases the hazard of a second
heart attack by a factor of 1.5. Because **smoke** is not interacted
with the other covariates, the hazard ratio is constant.

Though researchers become accustomed to the hazard ratio, it is notoriously difficult to explain to patients, policy makers, and other nontechnical people. I, as a technical person, can back out how bad a factor of 1.5 might be, but I would prefer an effect measured in units of time to failure.
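
To give a sense of what backing out the effect involves: under proportional hazards, the survivor functions of smokers and nonsmokers with the same covariates are linked by the relation below. The baseline value of 0.80 is a hypothetical number chosen for illustration, not an estimate from these data.

$$
S_{\text{smoke}}(t) = S_{\text{not\_smoke}}(t)^{\text{HR}},
\qquad
0.80^{1.54} \approx 0.71
$$

So if 80% of comparable nonsmokers had not yet had a second heart attack at some time $t$, a hazard ratio of 1.54 would imply that about 71% of smokers had not. The translation depends on the baseline survivor function, which is one reason the hazard ratio alone is hard to communicate.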

When it is constant, the hazard ratio from these observational data
has the same interpretation as the hazard ratio we would obtain if
the treatment **smoke** were randomized, given that the missing
potential outcome is missing at random conditional on the
covariates. When the treatment **smoke** is interacted with the other
covariates, the hazard ratio varies over covariate patterns, and the
Cox model parameters cannot be used by themselves to recover the
hazard ratio when smoking is randomized.

Below, we estimate the parameters of a Cox model in which **smoke** is
interacted with the other covariates.

    . stcox ibn.smoke#c.(age exercise diet)

    Cox regression -- no ties

    No. of subjects =        5,000                  Number of obs    =       5,000
    No. of failures =        2,969
    Time at risk    =  10972.84266
                                                    LR chi2(6)       =      223.11
    Log likelihood  =   -21987.493                  Prob > chi2      =      0.0000

    -----------------------------------------------------------------------------------
                   _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ------------------+----------------------------------------------------------------
          smoke#c.age |
           Nonsmoker  |   1.714749   .1751413     5.28   0.000     1.403655    2.094791
              Smoker  |   3.979649   1.110035     4.95   0.000     2.303673    6.874936
                      |
               smoke# |
           c.exercise |
           Nonsmoker  |   .5514891   .0476827    -6.88   0.000     .4655224    .6533309
              Smoker  |   .2839313   .0822003    -4.35   0.000     .1609844    .5007752
                      |
         smoke#c.diet |
           Nonsmoker  |   .4461597   .0389598    -9.24   0.000     .3759769    .5294433
              Smoker  |   .6908017   .1785842    -1.43   0.152      .416201    1.146578
    -----------------------------------------------------------------------------------

The estimated hazard ratios for **age**, **exercise**, and **diet**
differ between smokers and nonsmokers. The formal tests reported
below indicate that these differences are jointly significant.

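    . contrast smoke#c.age smoke#c.exercise smoke#c.diet, overall

    Contrasts of marginal linear predictions

    Margins      : asbalanced

    ----------------------------------------------------
                     |         df        chi2     P>chi2
    -----------------+----------------------------------
         smoke#c.age |          1        8.05     0.0045
    smoke#c.exercise |          1        4.83     0.0280
        smoke#c.diet |          1        2.58     0.1083
             Overall |          3       21.91     0.0001
    ----------------------------------------------------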

The overall test rejects the null hypothesis that the effect of
smoking does not vary with **age**, **exercise**, and **diet**. We
cannot obtain the randomized hazard-ratio effect from these Cox model
parameters because the hazard ratio is not constant.
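
To see why, note that in the interacted model the smoker-versus-nonsmoker hazard ratio depends on the covariate values. The sketch below uses rounded versions of the hazard ratios from the interacted Cox model, with $\beta_{k,s}$ denoting the coefficient on covariate $k$ for smoking status $s$ (notation assumed here for illustration).

$$
\text{HR}(\mathbf{x}) = \exp\Big(\sum_{k} (\beta_{k,1} - \beta_{k,0})\,x_k\Big)
\approx \left(\frac{3.98}{1.71}\right)^{\text{age}}
\left(\frac{0.284}{0.551}\right)^{\text{exercise}}
\left(\frac{0.691}{0.446}\right)^{\text{diet}}
$$

A single population-level effect would require averaging over the distribution of the covariates, which the Cox model parameters alone do not supply.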

### Easy to interpret and easy to estimate

So instead of estimating the difficult-to-interpret hazard ratio, we estimate the average difference between the time to a second heart attack if everyone smoked and the time if no one smoked. This effect is known as the average treatment effect (ATE), and it is defined as

$$
\text{ATE} = E\left[\,t_{\text{smoke}} - t_{\text{not\_smoke}}\,\right]
$$

where $t_{\text{smoke}}$ is the time to second heart attack when a person smokes and $t_{\text{not\_smoke}}$ is the time to second heart attack when that person does not smoke. The expected value computes the mean of these individual-level differences in the population.

We can use a model for the outcome, a model for the treatment, or
both models to account for the missing potential outcome, because
either $t_{\text{smoke}}$ or $t_{\text{not\_smoke}}$ is missing at
random, conditional on covariates.

The regression-adjustment (RA) estimator uses a model for the
outcome to account for the missing potential outcome. The data lost
to censoring are handled in the log-likelihood function. Below, we
use **stteffects ra** to estimate the ATE.

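    . stteffects ra (age exercise diet) (smoke)

    Survival treatment-effects estimation           Number of obs     =      5,000
    Estimator      : regression adjustment
    Outcome model  : Weibull
    Treatment model: none
    Censoring model: none
    ------------------------------------------------------------------------------
                 |               Robust
              _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    ATE          |
           smoke |
        (Smoker  |
             vs  |
     Nonsmoker)  |  -1.520671   .2011014    -7.56   0.000    -1.914822   -1.126519
    -------------+----------------------------------------------------------------
    POmean       |
           smoke |
      Nonsmoker  |   4.057439   .1028462    39.45   0.000     3.855864    4.259014
    ------------------------------------------------------------------------------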

When everyone smokes, the average time to a second heart attack is estimated to be 1.5 years shorter than the 4.1 years that would be observed if no one smoked.
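
The implied mean for the everyone-smokes scenario is the nonsmoker mean plus the ATE, roughly 4.06 − 1.52 = 2.54 years. As a sketch, the same kind of quantity can be requested directly with the `pomeans` and `atet` statistic options of **stteffects ra**. The output is not shown because these runs were not part of the analysis above.

```stata
* Report both potential-outcome means instead of the ATE
stteffects ra (age exercise diet) (smoke), pomeans

* Report the average treatment effect on the treated (here, the smokers)
stteffects ra (age exercise diet) (smoke), atet
```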

These estimates are much easier to interpret than the hazard ratio,
and the underlying model implicitly interacts the treatment with all
the covariates. Instead of the default Weibull distribution, we
could have used the gamma distribution or the lognormal
distribution to model the outcome. **stteffects ra** also allows us
to model the second (ancillary) parameter of each of these
distributions as a function of covariates, which produces flexible
outcome models.
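
A minimal sketch of these alternatives follows. It assumes the same variables as above. Letting the ancillary parameter depend on **age** is an illustrative choice, not something the analysis in this article did.

```stata
* Gamma outcome model instead of the default Weibull
stteffects ra (age exercise diet, gamma) (smoke)

* Lognormal outcome model whose ancillary parameter varies with age
* (the choice of age here is hypothetical, for illustration only)
stteffects ra (age exercise diet, lnormal ancillary(age)) (smoke)
```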

The inverse-probability-weighted (IPW) estimator uses models for the
treatment assignment and the censoring process to account for the
missing data. Below, we use **stteffects ipw** to estimate the
ATE.

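    . stteffects ipw (smoke age exercise diet) (age exercise diet)

    Survival treatment-effects estimation           Number of obs     =      5,000
    Estimator      : inverse-probability weights
    Outcome model  : weighted mean
    Treatment model: logit
    Censoring model: Weibull
    ------------------------------------------------------------------------------
                 |               Robust
              _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    ATE          |
           smoke |
        (Smoker  |
             vs  |
     Nonsmoker)  |  -1.689397   .3373219    -5.01   0.000    -2.350536   -1.028258
    -------------+----------------------------------------------------------------
    POmean       |
           smoke |
      Nonsmoker  |   4.200135   .2156737    19.47   0.000     3.777423    4.622848
    ------------------------------------------------------------------------------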

When everyone smokes, the average time to a second heart attack is estimated to be 1.7 years shorter than the 4.2 years that would be observed if no one smoked. As with the RA estimator, other distributions and flexible parameterizations are available for the treatment model and the censoring-process model.
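
As a sketch of what such an alternative specification might look like, the command below swaps in a probit treatment model and a gamma censoring model. These particular choices are assumptions made for illustration; the article's own analysis used the logit and Weibull defaults.

```stata
* First group: treatment variable and treatment-model covariates (probit)
* Second group: censoring-model covariates (gamma)
stteffects ipw (smoke age exercise diet, probit) (age exercise diet, gamma)
```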

The inverse-probability-weighted regression-adjustment estimator
implemented in **stteffects ipwra** uses models for the outcome
and the treatment, and optionally the censoring process, to increase
efficiency. It offers the flexibility of both the RA and the IPW
estimators.
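
A minimal sketch of the command for these data follows, assuming the same covariates enter all three models. No output is shown because this estimator was not run in the article.

```stata
* Outcome model, treatment model, and (optional) censoring model, in that order
stteffects ipwra (age exercise diet) (smoke age exercise diet) (age exercise diet)
```

Omitting the third set of parentheses would drop the censoring model, in which case censoring is handled in the outcome model's log-likelihood function, as with the RA estimator.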

### Conclusion and further reading

With survival data, many researchers default to a Cox model, even
though the effect estimable from the Cox model parameters is
difficult to interpret for nontechnical audiences. The Cox model is
less flexible for estimating treatment effects than many
researchers believe. The new **stteffects** command can flexibly
estimate an easy-to-interpret treatment effect. For more
information, see [TE] **stteffects**.

—David M. Drukker

Director of Econometrics