**[ST] stcox postestimation** -- Postestimation tools for stcox

__Postestimation commands__

The following postestimation commands are of special interest after
**stcox**:

Command                 Description
-------------------------------------------------------------------------
**estat concordance**   compute the concordance probability
**estat phtest**        test the proportional-hazards assumption
**stcoxkm**             plot Kaplan-Meier observed survival and Cox predicted
                        curves
**stcurve**             plot the survivor, hazard, and cumulative hazard
                        functions
**stphplot**            plot -ln{-ln(survival)} curves
-------------------------------------------------------------------------
**estat** **concordance** is not appropriate after estimation with **svy**.
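
As a rough sketch of how these commands are typically called, the following assumes the drugtr dataset and the model used in the Examples section; the covariate values passed to **stcurve** are arbitrary choices for illustration.

```stata
* Illustrative sketch only; dataset and model follow the Examples section
webuse drugtr, clear
stset studytime, failure(died)
stcox drug age

* Test the proportional-hazards assumption, covariate by covariate
estat phtest, detail

* Plot the fitted survivor function at two covariate patterns
* (age = 50 is an arbitrary value chosen for this sketch)
stcurve, survival at1(drug=0 age=50) at2(drug=1 age=50)

* Graphical checks of proportional hazards across treatment groups
stphplot, by(drug)
stcoxkm, by(drug)
```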

The following standard postestimation commands are also available:

Command                 Description
-------------------------------------------------------------------------
  **contrast**          contrasts and ANOVA-style joint tests of estimates
  **estat ic**          Akaike's and Schwarz's Bayesian information criteria
                        (AIC and BIC)
  **estat summarize**   summary statistics for the estimation sample
  **estat vce**         variance-covariance matrix of the estimators (VCE)
  **estat** (svy)       postestimation statistics for survey data
  **estimates**         cataloging estimation results
* **hausman**           Hausman's specification test
  **lincom**            point estimates, standard errors, testing, and
                        inference for linear combinations of coefficients
  **linktest**          link test for model specification
* **lrtest**            likelihood-ratio test
  **margins**           marginal means, predictive margins, marginal effects,
                        and average marginal effects
  **marginsplot**       graph the results from margins (profile plots,
                        interaction plots, etc.)
  **nlcom**             point estimates, standard errors, testing, and
                        inference for nonlinear combinations of coefficients
  **predict**           predictions, residuals, influence statistics, and
                        other diagnostic measures
  **predictnl**         point estimates, standard errors, testing, and
                        inference for generalized predictions
  **pwcompare**         pairwise comparisons of estimates
  **test**              Wald tests of simple and composite linear hypotheses
  **testnl**            Wald tests of nonlinear hypotheses
-------------------------------------------------------------------------
* **hausman** and **lrtest** are not appropriate with **svy** estimation results.
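
A minimal sketch of a few of these standard commands after **stcox**, again assuming the drugtr data from the Examples section. The name `full` and the 10-year contrast are illustrative choices, not part of the original documentation.

```stata
webuse drugtr, clear
stset studytime, failure(died)

* Full model, stored for a later likelihood-ratio test
stcox drug age
estimates store full

* Wald test that both coefficients are zero
test drug age

* Hazard ratio for a 10-year increase in age (illustrative contrast)
lincom 10*age, hr

* Information criteria for the full model
estat ic

* Likelihood-ratio test of the full model against a model without age
stcox drug
lrtest full .
```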

__Syntax for predict__

**predict** [*type*] *newvar* [*if*] [*in*] [**,** *sv_statistic* __nooff__**set** __part__**ial**]

**predict** [*type*] {*stub*\*|*newvarlist*} [*if*] [*in*] **,** *mv_statistic* [__part__**ial**]

*sv_statistic*          Description
-------------------------------------------------------------------------
Main
  **hr**                predicted hazard ratio, also known as the relative
                        hazard; the default
  **xb**                linear prediction xb
  **stdp**              standard error of the linear prediction; SE(xb)
* __bases__**urv**      baseline survivor function
* __basec__**hazard**   baseline cumulative hazard function
* **basehc**            baseline hazard contributions
* __mg__**ale**         martingale residuals
* __csn__**ell**        Cox-Snell residuals
* __dev__**iance**      deviance residuals
* __ld__**isplace**     likelihood displacement values
* __lm__**ax**          LMAX measures of influence
* __eff__**ects**       log frailties
-------------------------------------------------------------------------

*mv_statistic*          Description
-------------------------------------------------------------------------
Main
* __sco__**res**        efficient score residuals
* **esr**               synonym for **scores**
* __dfb__**eta**        DFBETA measures of influence
* __sch__**oenfeld**    Schoenfeld residuals
* __sca__**ledsch**     scaled Schoenfeld residuals
-------------------------------------------------------------------------
Unstarred statistics are available both in and out of sample; type
**predict** ... **if e(sample)** ... if you want them only for the estimation
sample. Starred statistics are calculated only for the estimation sample,
even when **if e(sample)** is not specified. **nooffset** is allowed only with
unstarred statistics.
**mgale**, **csnell**, **deviance**, **ldisplace**, **lmax**, **dfbeta**, **schoenfeld**, and
**scaledsch** are not allowed with **svy** estimation results.
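
The distinction between starred and unstarred statistics can be seen with a sketch like the following. It assumes the drugtr data and fits the model on an arbitrary subset (`age < 60` is a made-up cutoff) so that some observations fall outside the estimation sample.

```stata
webuse drugtr, clear
stset studytime, failure(died)

* Fit on a subset so that e(sample) excludes some observations
* (the cutoff is arbitrary and used only for illustration)
stcox drug age if age < 60

* Unstarred statistic: computed in and out of sample
predict double hr_all, hr

* Unstarred statistic, restricted by hand to the estimation sample
predict double hr_in if e(sample), hr

* Starred statistic: restricted to the estimation sample automatically
predict double mg, mgale

* Compare how many observations received a value in each case
count if !missing(hr_all)
count if !missing(hr_in)
count if !missing(mg)
```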

__Menu for predict__

**Statistics > Postestimation**

__Description for predict__

**predict** creates a new variable containing predictions such as hazard
ratios; linear predictions; standard errors; baseline survivor,
cumulative hazard, and hazard functions; martingale, Cox-Snell, deviance,
efficient score, Schoenfeld, and scaled Schoenfeld residuals; likelihood
displacement values; LMAX measures of influence; log frailties; and
DFBETA measures of influence.

__Options for predict__

+------+
----+ Main +-------------------------------------------------------------

**hr**, the default, calculates the relative hazard (hazard ratio), that is,
the exponentiated linear prediction.

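**xb** calculates the linear prediction from the fitted model. That is, you
fit the model by estimating a set of parameters b1, b2, ..., bk (the Cox
model has no intercept b0; it is absorbed into the baseline hazard), and
the linear prediction is xb = b1*x1 + b2*x2 + ... + bk*xk.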

The x used in the calculation is obtained from the data currently in
memory and need not correspond to the data on the independent
variables used in estimating b.

**stdp** calculates the standard error of the prediction, that is, the
standard error of xb.
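
A short sketch relating the three statistics, assuming the drugtr model from the Examples section. The variable names and the normal-approximation interval are illustrative assumptions rather than something **predict** produces directly.

```stata
webuse drugtr, clear
stset studytime, failure(died)
stcox drug age

predict double xb, xb
predict double hr, hr
predict double se_xb, stdp

* The hazard ratio should equal exp(xb) up to rounding
generate double hr_check = exp(xb)
summarize hr hr_check

* Approximate 95% interval for each subject's relative hazard,
* built on the linear-prediction scale and then exponentiated
generate double hr_lb = exp(xb - invnormal(0.975)*se_xb)
generate double hr_ub = exp(xb + invnormal(0.975)*se_xb)
list drug age hr hr_lb hr_ub in 1/5
```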

**basesurv** calculates the baseline survivor function. In the null model,
this is equivalent to the Kaplan-Meier product-limit estimate. If
**stcox**'s **strata()** option was specified, baseline survivor functions
for each stratum are provided.

**basechazard** calculates the cumulative baseline hazard. If **stcox**'s
**strata()** option was specified, cumulative baseline hazards for each
stratum are provided.

**basehc** calculates the baseline hazard contributions. These are used to
construct the product-limit type estimator for the baseline survivor
function generated by **basesurv**. If **stcox**'s **strata()** option was
specified, baseline hazard contributions for each stratum are
provided.
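
The baseline functions refer to all covariates equal to 0, so it is often sensible to center continuous covariates first. The sketch below assumes the drugtr data; `age_c`, `S0`, `H0`, `h0`, and `S_i` are hypothetical names. The last step uses the standard Cox relationship S(t|x) = S0(t)^exp(xb).

```stata
webuse drugtr, clear
stset studytime, failure(died)

* Center age so that the baseline describes drug = 0 at the mean age
summarize age, meanonly
generate double age_c = age - r(mean)
stcox drug age_c

predict double S0, basesurv
predict double H0, basechazard
predict double h0, basehc

* Baseline survivor function as a step function of analysis time
line S0 _t, sort connect(stairstep)

* Survivor probability at each subject's own time and covariates
predict double hr, hr
generate double S_i = S0^hr
```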

**mgale** calculates the martingale residuals. For
multiple-record-per-subject data, by default only one value per
subject is calculated, and it is placed on the last record for the
subject.

Adding the **partial** option will produce partial martingale residuals,
one for each record within subject; see **partial** below. Partial
martingale residuals are the additive contributions to a subject's
overall martingale residual. In single-record-per-subject data, the
partial martingale residuals are the martingale residuals.
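
One common use of martingale residuals is to explore the functional form of a covariate that has been left out of the model. This is a sketch under that assumption, using the drugtr data; it is not the only way to use these residuals.

```stata
webuse drugtr, clear
stset studytime, failure(died)

* Fit the model without the covariate whose functional form is in question
stcox drug
predict double mg, mgale

* A roughly linear smooth suggests age can enter the model untransformed
lowess mg age
```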

**csnell** calculates the Cox-Snell generalized residuals. For
multiple-record data, by default only one value per subject is
calculated, and it is placed on the last record for the subject.

Adding the **partial** option will produce partial Cox-Snell residuals,
one for each record within subject; see **partial** below. Partial
Cox-Snell residuals are the additive contributions to a subject's
overall Cox-Snell residual. In single-record data, the partial
Cox-Snell residuals are the Cox-Snell residuals.
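
Cox-Snell residuals are often used for an overall goodness-of-fit check: if the model fits, they behave like a censored sample from a unit exponential distribution. The sketch below assumes the drugtr data and uses the hypothetical names `cs` and `H`. Note that it re-**stset**s the data, which replaces the original st settings.

```stata
webuse drugtr, clear
stset studytime, failure(died)
stcox drug age
predict double cs, csnell

* Treat the residuals as survival times, keeping the failure indicator
stset cs, failure(died)

* Nelson-Aalen cumulative hazard of the residuals
sts generate H = na

* Under a good fit, H plotted against cs follows the 45-degree line
line H cs cs, sort ytitle("Cumulative hazard") xtitle("Cox-Snell residual")
```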

**deviance** calculates the deviance residuals. Deviance residuals are
martingale residuals that have been transformed to be more symmetric
about zero. For multiple-record data, by default only one value per
subject is calculated, and it is placed on the last record for the
subject.

Adding the **partial** option will produce partial deviance residuals,
one for each record within subject; see **partial** below. Partial
deviance residuals are transformed partial martingale residuals. In
single-record data, the partial deviance residuals are the deviance
residuals.
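
Because deviance residuals are more symmetric about zero, they are convenient for spotting poorly fit subjects. A minimal sketch, assuming the drugtr model; the threshold of 2 is an arbitrary screening value, not a formal rule.

```stata
webuse drugtr, clear
stset studytime, failure(died)
stcox drug age

predict double dev, deviance
predict double xb, xb

* Residuals far from zero flag subjects the model fits poorly
scatter dev xb, yline(0)

* List candidates using an arbitrary cutoff
list studytime died drug age dev if abs(dev) > 2 & !missing(dev)
```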

**ldisplace** calculates the likelihood displacement values. A likelihood
displacement value is an influence measure of the effect of deleting
a subject on the overall coefficient vector. For multiple-record
data, by default only one value per subject is calculated, and it is
placed on the last record for the subject.

Adding the **partial** option will produce partial likelihood
displacement values, one for each record within subject; see **partial**
below. Partial displacement values are interpreted as effects due to
deletion of individual records rather than deletion of individual
subjects. In single-record data, the partial likelihood displacement
values are the likelihood displacement values.

**lmax** calculates the LMAX measures of influence. LMAX values are related
to likelihood displacement values because they also measure the
effect of deleting a subject on the overall coefficient vector. For
multiple-record data, by default only one LMAX value per subject is
calculated, and it is placed on the last record for the subject.

Adding the **partial** option will produce partial LMAX values, one for
each record within subject; see **partial** below. Partial LMAX values
are interpreted as effects due to deletion of individual records
rather than deletion of individual subjects. In single-record data,
the partial LMAX values are the LMAX values.
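
Both influence measures can be plotted against analysis time to identify subjects with a large effect on the coefficient vector. The sketch assumes the drugtr data; `obs` is a hypothetical label variable created only for marking points.

```stata
webuse drugtr, clear
stset studytime, failure(died)
stcox drug age

* Hypothetical observation label for the plots
generate long obs = _n

predict double ld, ldisplace
predict double lm, lmax

* Points that stand apart from the rest deserve a closer look
scatter ld _t, mlabel(obs)
scatter lm _t, mlabel(obs)
```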

**effects** is for use after **stcox, shared()** and provides estimates of the
log frailty for each group. The log frailties are random
group-specific offsets to the linear predictor that measure the group
effect on the log relative-hazard.
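
A sketch of retrieving the log frailties, assuming the catheter dataset distributed with Stata (variables `time`, `infect`, `age`, `female`, and `patient`); the names `nu`, `frailty`, and `first` are hypothetical.

```stata
webuse catheter, clear
stset time, failure(infect)

* Shared-frailty Cox model with patients as groups
stcox age female, shared(patient)

* Estimated log frailty for each group
predict double nu, effects

* Frailty on the hazard scale
generate double frailty = exp(nu)

* Show one row per patient, ordered from lowest to highest frailty
egen byte first = tag(patient)
sort nu
list patient nu frailty if first
```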

**scores** calculates the efficient score residuals for each regressor in the
model. For multiple-record data, by default only one score per
subject is calculated, and it is placed on the last record for the
subject.

Adding the **partial** option will produce partial efficient score
residuals, one for each record within subject; see **partial** below.
Partial efficient score residuals are the additive contributions to a
subject's overall efficient score residual. In single-record data,
the partial efficient score residuals are the efficient score
residuals.

One efficient score residual variable is created for each regressor
in the model; the first new variable corresponds to the first
regressor, the second to the second, and so on.

**esr** is a synonym for **scores**.

**dfbeta** calculates the DFBETA measures of influence for each regressor in
the model. The DFBETA value for a subject estimates the change in
the regressor's coefficient due to deletion of that subject. For
multiple-record data, by default only one value per subject is
calculated, and it is placed on the last record for the subject.

Adding the **partial** option will produce partial DFBETAs, one for each
record within subject; see **partial** below. Partial DFBETAs are
interpreted as effects due to deletion of individual records rather
than deletion of individual subjects. In single-record data, the
partial DFBETAs are the DFBETAs.

One DFBETA variable is created for each regressor in the model; the
first new variable corresponds to the first regressor, the second to
the second, and so on.

**schoenfeld** calculates the Schoenfeld residuals. This option may not be
used after **stcox** with the **exactm** or **exactp** option. Schoenfeld
residuals are calculated and reported only at failure times.

One Schoenfeld residual variable is created for each regressor in the
model; the first new variable corresponds to the first regressor, the
second to the second, and so on.

**scaledsch** calculates the scaled Schoenfeld residuals. This option may
not be used after **stcox** with the **exactm** or **exactp** option. Scaled
Schoenfeld residuals are calculated and reported only at failure
times.

One scaled Schoenfeld residual variable is created for each regressor
in the model; the first new variable corresponds to the first
regressor, the second to the second, and so on.
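
A minimal sketch of obtaining both kinds of Schoenfeld residuals, assuming the drugtr model and naming each new variable explicitly (the names are hypothetical). The smooth in the last line is an informal look at whether the age effect drifts over time.

```stata
webuse drugtr, clear
stset studytime, failure(died)
stcox drug age

* One new variable per regressor, in the order drug, age
predict double sch_drug sch_age, schoenfeld
predict double ssch_drug ssch_age, scaledsch

* Residuals exist only for observations that failed
count if !missing(sch_age)
count if _d == 1

* A flat smooth is consistent with proportional hazards for age
lowess ssch_age _t
```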

Note: The easiest way to use the preceding options (**scores**, **esr**, **dfbeta**,
**schoenfeld**, and **scaledsch**) is, for example,

**. predict double** *stub*\***, scores**

where *stub* is a short name of your choosing. Stata then creates
variables *stub***1**, *stub***2**, etc. You may also specify each variable name
explicitly, in which case you must specify exactly as many variables as
there are regressors in the model.
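
Both naming styles are sketched below for the drugtr model, which has two regressors; `sc`, `dfb_drug`, `dfb_age`, and `obs` are hypothetical names.

```stata
webuse drugtr, clear
stset studytime, failure(died)
stcox drug age

* Stub form: creates sc1 (for drug) and sc2 (for age)
predict double sc*, scores

* Explicit form: exactly one name per regressor, in model order
predict double dfb_drug dfb_age, dfbeta

describe sc1 sc2 dfb_drug dfb_age

* Subjects whose deletion would shift the age coefficient the most
generate long obs = _n
scatter dfb_age _t, yline(0) mlabel(obs)
```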

**nooffset** is allowed only with **hr**, **xb**, and **stdp**, and is relevant only if
you specified **offset(***varname***)** for **stcox**. It modifies the
calculations made by **predict** so that they ignore the offset variable;
the linear prediction is treated as xb rather than xb + offset.
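
The effect of **nooffset** can be checked with a sketch such as the following. The offset variable `off` is entirely artificial and constructed only so that there is something to pass to **offset()**.

```stata
webuse drugtr, clear
stset studytime, failure(died)

* Artificial offset, for illustration only
generate double off = 0.05*(age - 55)
stcox drug, offset(off)

* Default: linear prediction includes the offset
predict double xb_with, xb

* nooffset: linear prediction ignores the offset
predict double xb_without, xb nooffset

* The difference between the two should be the offset itself
generate double diff = xb_with - xb_without - off
summarize diff
```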

**partial** is relevant only for multiple-record data and is valid with
**mgale**, **csnell**, **deviance**, **ldisplace**, **lmax**, **scores**, **esr**, and **dfbeta**.
Specifying **partial** will produce "partial" versions of these
statistics, where one value is calculated for each record instead of
one for each subject. The subjects are determined by the **id()** option
to **stset**.

Specify **partial** if you wish to perform diagnostics on individual
records rather than on individual subjects. For example, a partial
DFBETA would be interpreted as the effect on a coefficient due to
deletion of one record, rather than the effect due to deletion of all
records for a given subject.
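
A sketch of the subject-level and record-level versions side by side, assuming the stan3 dataset distributed with Stata, in which some subjects have more than one record. The names `mg`, `mg_part`, and `mg_sum` are hypothetical.

```stata
webuse stan3, clear
stset t1, failure(died) id(id)
stcox age posttran surgery year

* One value per subject, stored on the subject's last record
predict double mg, mgale

* One value per record
predict double mg_part, mgale partial

* Partial residuals should add up to the subject-level residual
sort id _t0
by id: egen double mg_sum = total(mg_part)
list id _t0 _t mg mg_part mg_sum if id <= 4, sepby(id)
```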

__Predictions after stcox with the tvc() option__

The only **predict** options supported after **stcox** with the **tvc()** option are
**hr**, **xb**, and **stdp**. The other predictions require that you **stsplit** your
data to make explicit the time-varying covariates implied by **tvc()**; see
the note on **tvc()** in **[ST] stcox**.
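
The **stsplit** route can be sketched as follows, assuming the drugtr data and the default time function for **tvc()**, which interacts the covariate with `_t`. The names `subj` and `age_t` are hypothetical, and the second fit is expected, not guaranteed here, to reproduce the first.

```stata
webuse drugtr, clear

* stsplit needs a subject identifier; create one for this sketch
generate long subj = _n
stset studytime, failure(died) id(subj)

* Fit with tvc(); only hr, xb, and stdp are available afterward
stcox drug age, tvc(age)
predict double hr_tvc, hr

* Explicit alternative: split at failure times and build age*_t by hand
stsplit, at(failures)
generate double age_t = age*_t
stcox drug age age_t

* Other predictions are now available
predict double mg, mgale
```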

__Predictions after stcox with the shared() option__

All **predict** options described above are supported for shared-frailty
models fit using **stcox** with the **shared()** option. Predictions are
conditional on the estimated frailty variance, theta, and the definition
of baseline is extended to represent covariates equal to 0 and a frailty
value of 1 (log frailty of 0).

__Syntax for margins__

**margins** [*marginlist*] [**,** *options*]

**margins** [*marginlist*] **,** __pr__**edict(***statistic* ...**)** [__pr__**edict(***statistic* ...**)**
...] [*options*]

*statistic*             Description
-------------------------------------------------------------------------
**hr**                  predicted hazard ratio, also known as the relative
                        hazard; the default
**xb**                  linear prediction xb
**stdp**                not allowed with **margins**
__bases__**urv**        not allowed with **margins**
__basec__**hazard**     not allowed with **margins**
**basehc**              not allowed with **margins**
__mg__**ale**           not allowed with **margins**
__csn__**ell**          not allowed with **margins**
__dev__**iance**        not allowed with **margins**
__ld__**isplace**       not allowed with **margins**
__lm__**ax**            not allowed with **margins**
__eff__**ects**         not allowed with **margins**
__sco__**res**          not allowed with **margins**
**esr**                 not allowed with **margins**
__dfb__**eta**          not allowed with **margins**
__sch__**oenfeld**      not allowed with **margins**
__sca__**ledsch**       not allowed with **margins**
-------------------------------------------------------------------------

Statistics not allowed with **margins** are functions of stochastic
quantities other than **e(b)**.

For the full syntax, see **[R] margins**.

__Menu for margins__

**Statistics > Postestimation**

__Description for margins__

**margins** estimates margins of response for hazard ratios and linear
predictions.
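
A minimal sketch of **margins** after **stcox**, assuming the drugtr data. Factor-variable notation is used for `drug` so that **margins** treats it as categorical; the ages 50 and 60 are arbitrary.

```stata
webuse drugtr, clear
stset studytime, failure(died)

* i.drug lets margins treat drug as a categorical covariate
stcox i.drug age

* Predictive margins of the hazard ratio (the default statistic) by drug
margins drug

* Linear prediction at two selected ages
margins, predict(xb) at(age=(50 60))

* Graph the most recent margins results
marginsplot
```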

__Syntax for estat concordance__

**estat** __con__**cordance** [*if*] [*in*] [**,** *concordance_options*]

*concordance_options*   Description
-------------------------------------------------------------------------
Main
  __h__**arrell**       compute Harrell's C coefficient; the default
  __gh__**eller**       compute Gönen and Heller's concordance coefficient
  **se**                compute asymptotic standard error of Gönen and
                        Heller's coefficient
  **all**               compute statistic for all observations in the data
  __nosh__**ow**        do not show st setting information
-------------------------------------------------------------------------

__Menu for estat__

**Statistics > Postestimation**

__Description for estat__

**estat concordance** calculates the concordance probability, which is
defined as the probability that predictions and outcomes are concordant.
**estat concordance** provides two measures of the concordance probability:
Harrell's C and Gönen and Heller's K concordance coefficients. **estat**
**concordance** also reports Somers's D rank correlation, which is obtained
as 2C-1 or 2K-1.

__Options for estat concordance__

+------+
----+ Main +-------------------------------------------------------------

**harrell**, the default, calculates Harrell's C coefficient, which is
defined as the proportion of all usable subject pairs in which the
predictions and outcomes are concordant.

**gheller** calculates Gönen and Heller's K concordance coefficient instead
of Harrell's C coefficient. The **harrell** and **gheller** options may be
specified together to obtain both concordance measures.

**se** calculates the smoothed version of Gönen and Heller's K concordance
coefficient and its asymptotic standard error. The **se** option
requires the **gheller** option.

**all** requests that the statistic be computed for all observations in the
data. By default, **estat concordance** computes over the estimation
subsample.

**noshow** prevents **estat concordance** from displaying the identities of the
key st variables above its output.
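
The option combinations described above might be used as in this sketch, which assumes the drugtr model from the Examples section.

```stata
webuse drugtr, clear
stset studytime, failure(died)
stcox drug age

* Both concordance measures in one call
estat concordance, harrell gheller

* Smoothed K and its asymptotic standard error (se requires gheller)
estat concordance, gheller se

* Harrell's C over all observations, without the st header
estat concordance, all noshow
```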

__Examples__

Setup
**. webuse drugtr**

Declare data to be survival-time data
**. stset studytime, failure(died)**

Fit Cox model
**. stcox drug age**

Obtain martingale residuals
**. predict double mg, mgale**

Obtain Cox-Snell residuals
**. predict double cs, csnell**

Obtain deviance residuals
**. predict double dev, deviance**

Calculate Harrell's C
**. estat concordance**

Calculate Gönen and Heller's concordance coefficient
**. estat concordance, gheller**

__Stored results__

**estat concordance** stores the following in **r()**:

Scalars
  **r(N)**          number of observations
  **r(n_P)**        number of comparison pairs
  **r(n_E)**        number of orderings as expected
  **r(n_T)**        number of tied predictions
  **r(C)**          Harrell's C coefficient
  **r(K)**          Gönen and Heller's K coefficient
  **r(K_s)**        smoothed Gönen and Heller's K coefficient
  **r(K_s_se)**     standard error of the smoothed K coefficient
  **r(D)**          Somers's D coefficient for Harrell's C
  **r(D_K)**        Somers's D coefficient for Gönen and Heller's K

**r(n_P)**, **r(n_E)**, and **r(n_T)** are returned only when strata are not
specified.
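
A short sketch of using the stored results, assuming the drugtr model. It copies r(C) and r(D) into scalars (the scalar names are hypothetical) and displays 2C-1 next to Somers's D, which should agree given the definition above.

```stata
webuse drugtr, clear
stset studytime, failure(died)
stcox drug age

estat concordance

* Inspect everything left behind in r()
return list

* Keep the values before another command overwrites r()
scalar c_harrell = r(C)
scalar d_somers  = r(D)

display "Harrell's C = " %6.4f c_harrell
display "2C - 1      = " %6.4f (2*c_harrell - 1)
display "Somers's D  = " %6.4f d_somers
```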