**[ST] streg** -- Parametric survival models

__Syntax__

**streg** [*indepvars*] [*if*] [*in*] [**,** *options*]

*options* Description
-------------------------------------------------------------------------
Model
__nocons__**tant** suppress constant term
__dist__**ribution(**__e__**xponential)** exponential survival distribution
__dist__**ribution(**__gom__**pertz)** Gompertz survival distribution
__dist__**ribution(**__logl__**ogistic)** loglogistic survival distribution
__dist__**ribution(**__ll__**ogistic)** synonym for **distribution(loglogistic)**
__dist__**ribution(**__w__**eibull)** Weibull survival distribution
__dist__**ribution(**__logn__**ormal)** lognormal survival distribution
__dist__**ribution(**__ln__**ormal)** synonym for **distribution(lognormal)**
__dist__**ribution(**__ggam__**ma)** generalized gamma survival distribution
__fr__**ailty(**__g__**amma)** gamma frailty distribution
__fr__**ailty(**__i__**nvgaussian)** inverse-Gaussian frailty distribution
**time** use accelerated failure-time metric

Model 2
__st__**rata(***varname***)** strata ID variable
__off__**set(***varname***)** include *varname* in model with coefficient constrained to 1
__sh__**ared(***varname***)** shared frailty ID variable
__anc__**illary(***varlist***)** use *varlist* to model the first ancillary parameter
**anc2(***varlist***)** use *varlist* to model the second ancillary parameter
__const__**raints(***constraints***)** apply specified linear constraints
__col__**linear** keep collinear variables

SE/Robust
**vce(***vcetype***)** *vcetype* may be **oim**, __r__**obust**, __cl__**uster** *clustvar*, **opg**, __boot__**strap**, or __jack__**knife**

Reporting
__l__**evel(***#***)** set confidence level; default is **level(95)**
**nohr** do not report hazard ratios
__tr__**atio** report time ratios
__nos__**how** do not show st setting information
__nohead__**er** suppress header from coefficient table
__nolr__**test** do not perform likelihood-ratio test
__nocnsr__**eport** do not display constraints
*display_options* control columns and column formats, row spacing, line width, display of omitted variables and base and empty cells, and factor-variable labeling

Maximization
*maximize_options* control the maximization process; seldom used

__coefl__**egend** display legend instead of statistics
-------------------------------------------------------------------------
You must **stset** your data before using **streg**; see **[ST] stset**.
*indepvars* and *varlist* may contain factor variables; see fvvarlist.
**bayes**, **bootstrap**, **by**, **fmm**, **fp**, **jackknife**, **mfp**, **mi estimate**, **nestreg**,
**statsby**, **stepwise**, and **svy** are allowed; see prefix. For more details,
see **[BAYES] bayes: streg** and **[FMM] fmm: streg**.
**vce(bootstrap)** and **vce(jackknife)** are not allowed with the **mi estimate**
prefix.
**shared()**, **vce()**, and **noheader** are not allowed with the **svy** prefix.
**fweight**s, **iweight**s, and **pweight**s may be specified using **stset**; see **[ST]**
**stset**. However, weights may not be specified if you are using the
**bootstrap** prefix with the **streg** command.
**coeflegend** does not appear in the dialog box.
See **[ST] streg postestimation** for features available after estimation.

__Menu__

**Statistics > Survival analysis > Regression models > Parametric survival models**

__Description__

**streg** performs maximum likelihood estimation for parametric regression
survival-time models. **streg** can be used with single- or multiple-record
or single- or multiple-failure st data. Survival models currently
supported are exponential, Weibull, Gompertz, lognormal, loglogistic, and
generalized gamma. Parametric frailty models and shared-frailty models
are also fit using **streg**.

Also see **[ST] stcox** for proportional hazards models.

__Options__

+-------+
----+ Model +------------------------------------------------------------

**noconstant**; see **[R] estimation options**.

**distribution(***distname***)** specifies the survival model to be fit. A
specified **distribution()** is remembered from one estimation to the
next when **distribution()** is not specified.

For instance, typing **streg** **x1** **x2,** **distribution(weibull)** fits a
Weibull model. Subsequently, you do not need to specify
**distribution(weibull)** to fit other Weibull regression models.

All Stata estimation commands, including **streg**, redisplay results
when you type the command name without arguments. To fit a model
with no explanatory variables, type **streg,** **distribution(***distname***)**
....
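
The difference between refitting, replaying, and fitting a constant-only model can be sketched as follows. This is a minimal illustration that assumes the **kva** dataset from the Examples section, which is already **stset**.

```stata
* Load the bearings data (already stset)
webuse kva, clear

* Fit a Weibull model; distribution(weibull) is now remembered
streg load bearings, distribution(weibull)

* No distribution() given, so another Weibull model is fit
streg load

* The command name alone only redisplays the last results
streg

* Constant-only model: the option is needed so this is not a replay
streg, distribution(weibull)
```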

**frailty(gamma** | **invgaussian)** specifies the assumed distribution of the
frailty, or heterogeneity. The estimation results, in addition to
the standard parameter estimates, will contain an estimate of the
variance of the frailties and a likelihood-ratio test of the null
hypothesis that this variance is zero. When this null hypothesis is
true, the model reduces to the model with **frailty(***distname***)** not
specified.

A specified **frailty()** is remembered from one estimation to the next
when **distribution()** is not specified. When you specify
**distribution()**, the previously remembered specification of **frailty()**
is forgotten.
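
As a rough illustration of what the frailty test compares, the likelihood-ratio statistic can be rebuilt by hand from two fits. The sketch below assumes the **bc** dataset from the Examples section; the scalar name `ll_nofrailty` is arbitrary. Because the null value lies on the boundary of the parameter space, the p-value that **streg** reports is not a plain chi-squared(1) p-value, so only the statistic is reproduced here.

```stata
webuse bc, clear
stset t, failure(dead)

* Reference fit without frailty; giving distribution() also clears
* any frailty() remembered from an earlier estimation
streg age smoking, distribution(weibull)
scalar ll_nofrailty = e(ll)

* Same model with gamma-distributed frailty
streg age smoking, distribution(weibull) frailty(gamma)

* Estimated frailty variance and the hand-computed LR statistic
display "theta        = " e(theta)
display "LR statistic = " 2*(e(ll) - ll_nofrailty)
```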

**time** specifies that the model be fit in the accelerated failure-time
metric rather than in the log relative-hazard metric. This option is
valid only for the exponential and Weibull models because these are
the only models that have both a proportional hazards and an
accelerated failure-time parameterization. Regardless of metric, the
likelihood function is the same, and models are equally appropriate
viewed in either metric; it is just a matter of changing
interpretation.

**time** must be specified at estimation.
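
The claim that both metrics share one likelihood can be checked numerically. The sketch below assumes the **kva** dataset and relies on the standard Weibull result that the accelerated failure-time coefficient equals the proportional-hazards coefficient multiplied by -1/p; the scalar names are arbitrary.

```stata
webuse kva, clear

* Proportional-hazards metric, showing coefficients
streg load bearings, distribution(weibull) nohr
scalar b_ph    = _b[load]
scalar shape_p = e(aux_p)
scalar ll_ph   = e(ll)

* Same model in the accelerated failure-time metric
streg load bearings, distribution(weibull) time

* The log likelihoods should agree up to numerical tolerance
display "ll (PH): " ll_ph "   ll (AFT): " e(ll)

* The AFT coefficient should match -b_ph/p
display "AFT coefficient: " _b[load] "   -b_ph/p: " (-b_ph/shape_p)
```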

+---------+
----+ Model 2 +----------------------------------------------------------

**strata(***varname***)** specifies the stratification ID variable. Observations
with equal values of the variable are assumed to be in the same
stratum. Stratified estimates (with equal coefficients across strata
but intercepts and ancillary parameters unique to each stratum) are
then obtained. This option is not available if **frailty(***distname***)** is
specified.
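
One way to see what stratification means here is to compare it with a model that lets both the intercept and the ancillary parameter depend on the stratum indicators. The sketch below assumes the **cancer** dataset from the Examples section and assumes that the two specifications are equivalent, which the matching log likelihoods are meant to check.

```stata
webuse cancer, clear
stset studytime, failure(died)

* Stratified fit: common age coefficient, stratum-specific
* intercepts and ancillary parameters
streg age, distribution(weibull) strata(drug)
scalar ll_strata = e(ll)

* Explicit version: drug indicators in the main equation and
* in the equation for the ancillary parameter
streg age i.drug, distribution(weibull) ancillary(i.drug)

* The two log likelihoods should agree
display "strata(): " ll_strata "   explicit: " e(ll)
```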

**offset(***varname***)**; see **[R] estimation options**.

**shared(***varname***)** is valid with **frailty()** and specifies a variable defining
those groups over which the frailty is shared, analogous to a
random-effects model for panel data where *varname* defines the panels.
**frailty()** specified without **shared()** treats the frailties as
occurring at the observation level.

A specified **shared()** is remembered from one estimation to the next
when **distribution()** is not specified. When you specify
**distribution()**, the previously remembered specification of **shared()**
is forgotten.

**shared()** may not be used with **distribution(ggamma)**, **vce(robust)**,
**vce(cluster** *clustvar***)**, **vce(opg)**, the **svy** prefix, or in the presence
of delayed entries or gaps.

If **shared()** is specified without **frailty()** and there is no remembered
**frailty()** from the previous estimation, **frailty(gamma)** is assumed to
provide behavior analogous to **stcox**; see **[ST] stcox**.

**ancillary(***varlist***)** specifies that the ancillary parameter for the
Weibull, lognormal, Gompertz, and loglogistic distributions and that
the first ancillary parameter (sigma) of the generalized log-gamma
distribution be estimated as a linear combination of *varlist*. This
option may not be used with **frailty(***distname***)**.

When an ancillary parameter is constrained to be strictly positive,
the logarithm of the ancillary parameter is modeled as a linear
combination of *varlist*.
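
Because the logarithm is modeled, the ancillary parameter itself is recovered by exponentiating the linear combination. The sketch below assumes the **hip3** dataset from the Examples section, that **male** is a 0/1 indicator, and that the ancillary equation is stored under the name `ln_p`; the **coeflegend** replay is included so the stored names can be checked before **nlcom** is run.

```stata
webuse hip3, clear
streg protect age, distribution(weibull) ancillary(male)

* Show the names under which the coefficients are stored
streg, coeflegend

* Back-transform to the Weibull shape parameter for each group
* (equation name ln_p is an assumption; adjust it to the legend above)
nlcom (p_female: exp(_b[ln_p:_cons]))                 ///
      (p_male:   exp(_b[ln_p:_cons] + _b[ln_p:male]))
```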

**anc2(***varlist***)** specifies that the second ancillary parameter (kappa) for
the generalized log-gamma distribution be estimated as a linear
combination of *varlist*. This option may not be used with
**frailty(***distname***)**.

**constraints(***constraints***)**, **collinear**; see **[R] estimation options**.

+-----------+
----+ SE/Robust +--------------------------------------------------------

**vce(***vcetype***)** specifies the type of standard error reported, which
includes types that are derived from asymptotic theory (**oim**, **opg**),
that are robust to some kinds of misspecification (**robust**), that
allow for intragroup correlation (**cluster** *clustvar*), and that use
bootstrap or jackknife methods (**bootstrap**, **jackknife**); see **[R]**
*vce_option*.

+-----------+
----+ Reporting +--------------------------------------------------------

**level(***#***)**; see **[R] estimation options**.

**nohr**, which may be specified at estimation or upon redisplaying results,
specifies that coefficients rather than exponentiated coefficients be
displayed, that is, that coefficients rather than hazard ratios be
displayed. This option affects only how coefficients are displayed,
not how they are estimated.

This option is valid only for models with a natural
proportional-hazards parameterization: exponential, Weibull, and
Gompertz. These three models, by default, report hazard ratios
(exponentiated coefficients).
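
The link between the two displays can be verified directly, since a hazard ratio is the exponential of the corresponding coefficient. This is a small sketch that assumes the **kva** dataset.

```stata
webuse kva, clear

* Fit and display coefficients
streg load bearings, distribution(weibull) nohr

* Exponentiate the coefficient on load by hand
display "hazard ratio for load: " exp(_b[load])

* Replay without nohr: the table now shows hazard ratios,
* and the entry for load should match the value above
streg
```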

**tratio** specifies that exponentiated coefficients, which are interpreted
as time ratios, be displayed. **tratio** is appropriate only for the
loglogistic, lognormal, and generalized gamma models, or for the
exponential and Weibull models when fit in the accelerated
failure-time metric.

**tratio** may be specified at estimation or upon replay.

**noshow** prevents **streg** from showing the key st variables. This option is
rarely used because most people type **stset, show** or **stset, noshow** to
set once and for all whether they want to see these variables
mentioned at the top of the output of every st command; see **[ST]**
**stset**.

**noheader** suppresses the output header, either at estimation or upon
replay.

**nolrtest** is valid only with frailty models, in which case it suppresses
the likelihood-ratio test for significant frailty.

**nocnsreport**; see **[R] estimation options**.

*display_options*: **noci**, __nopv__**alues**, __noomit__**ted**, **vsquish**, __noempty__**cells**,
__base__**levels**, __allbase__**levels**, __nofvlab__**el**, **fvwrap(***#***)**, **fvwrapon(***style***)**,
**cformat(%***fmt***)**, **pformat(%***fmt***)**, **sformat(%***fmt***)**, and **nolstretch**; see **[R]**
**estimation options**.

+--------------+
----+ Maximization +-----------------------------------------------------

*maximize_options*: __dif__**ficult**, __tech__**nique(***algorithm_spec***)**, __iter__**ate(***#***)**,
[__no__]__lo__**g**, __tr__**ace**, __grad__**ient**, **showstep**, __hess__**ian**, __showtol__**erance**,
__tol__**erance(***#***)**, __ltol__**erance(***#***)**, __nrtol__**erance(***#***)**, __nonrtol__**erance**, and
**from(***init_specs***)**; see **[R] maximize**. These options are seldom used.

Setting the optimization type to **technique(bhhh)** resets the default
*vcetype* to **vce(opg)**.

The following option is available with **streg** but is not shown in the
dialog box:

**coeflegend**; see **[R] estimation options**.

__Examples__

---------------------------------------------------------------------------
Setup
**. webuse kva**

Fit a Weibull survival model
**. streg load bearings, distribution(weibull)**

Replay results, but display coefficients rather than hazard ratios
**. streg, nohr**

Fit a Weibull survival model in the accelerated failure-time metric
**. streg load bearings, distribution(weibull) time**

---------------------------------------------------------------------------
Setup
**. webuse mfail**

Fit a Weibull survival model using data that have multiple failures per
subject, and specify robust standard errors
**. streg x1 x2, distribution(weibull) vce(robust)**

Same as above, but fit exponential model rather than Weibull
**. streg x1 x2, distribution(exp) vce(robust)**

---------------------------------------------------------------------------
Setup
**. webuse cancer**

Map values for **drug** into 0 for placebo and 1 for nonplacebo
**. replace drug = drug == 2 | drug == 3**

Declare data to be survival-time data
**. stset studytime, failure(died)**

Fit a generalized gamma survival model
**. streg drug age, distribution(ggamma)**

Test for appropriateness of Weibull model
**. test [/kappa] = 1**

---------------------------------------------------------------------------
Setup
**. webuse hip3, clear**

Fit a Weibull survival model, using **male** to model the ancillary parameter
**. streg protect age, dist(weibull) ancillary(male)**

---------------------------------------------------------------------------
Setup
**. webuse cancer**

Declare data to be survival-time data
**. stset studytime, failure(died)**

Fit a stratified Weibull survival model
**. streg age, dist(weibull) strata(drug)**

Produce a "less-stratified" model than above
**. streg age, dist(weibull) ancillary(i.drug)**

---------------------------------------------------------------------------
Setup
**. webuse bc**

List some of the data
**. list in 1/12**

Declare data to be survival-time data
**. stset t, fail(dead)**

Fit Weibull survival model with gamma-distributed frailty
**. streg age smoking, dist(weibull) frailty(gamma)**

Fit Weibull survival model with inverse-Gaussian-distributed frailty
**. streg age smoking, dist(weibull) frailty(invgauss)**

---------------------------------------------------------------------------
Setup
**. webuse catheter**

List some of the data
**. list in 1/10**

Declare data to be survival-time data
**. stset time, fail(infect)**

Fit Weibull survival model with inverse-Gaussian-distributed shared
frailty
**. streg age female, dist(weibull) frailty(invgauss) shared(patient)**

Same as above, but fit lognormal model rather than Weibull
**. streg age female, dist(lnormal) frailty(invgauss) shared(patient)**

---------------------------------------------------------------------------
Setup
**. webuse nhefs**

Declare survey design for data
**. svyset psu2 [pw=swgt2], strata(strata2)**

Declare data to be survival-time data
**. stset age_lung_cancer if age_lung_cancer < . [pw=swgt2],**
**fail(lung_cancer)**

Fit exponential survival model, taking into account that the data are survey data
**. svy: streg former_smoker smoker male urban1 rural, dist(exp)**
---------------------------------------------------------------------------

__Stored results__

**streg** stores the following in **e()**:

Scalars
**e(N)** number of observations
**e(N_sub)** number of subjects
**e(N_fail)** number of failures
**e(N_g)** number of groups
**e(k)** number of parameters
**e(k_eq)** number of equations in **e(b)**
**e(k_eq_model)** number of equations in overall model test
**e(k_aux)** number of auxiliary parameters
**e(k_dv)** number of dependent variables
**e(df_m)** model degrees of freedom
**e(ll)** log likelihood
**e(ll_0)** log likelihood, constant-only model
**e(ll_c)** log likelihood, comparison model
**e(N_clust)** number of clusters
**e(chi2)** chi-squared
**e(chi2_c)** chi-squared, comparison model
**e(risk)** total time at risk
**e(g_min)** smallest group size
**e(g_avg)** average group size
**e(g_max)** largest group size
**e(theta)** frailty parameter
**e(aux_p)** ancillary parameter (**weibull**)
**e(gamma)** ancillary parameter (**gompertz, loglogistic**)
**e(sigma)** ancillary parameter (**ggamma, lnormal**)
**e(kappa)** ancillary parameter (**ggamma**)
**e(p)** p-value for model test
**e(p_c)** p-value for comparison test
**e(rank)** rank of **e(V)**
**e(rank0)** rank of **e(V)**, constant-only model
**e(ic)** number of iterations
**e(rc)** return code
**e(converged)** **1** if converged, **0** otherwise

Macros
**e(cmd)** model or regression name
**e(cmd2)** **streg**
**e(cmdline)** command as typed
**e(dead)** **_d**
**e(depvar)** **_t**
**e(strata)** stratum variable
**e(title)** title in estimation output
**e(clustvar)** name of cluster variable
**e(shared)** frailty grouping variable
**e(fr_title)** title in output identifying frailty
**e(wtype)** weight type
**e(wexp)** weight expression
**e(t0)** **_t0**
**e(vce)** *vcetype* specified in **vce()**
**e(vcetype)** title used to label Std. Err.
**e(frm2)** **hazard** or **time**
**e(chi2type)** **Wald** or **LR**; type of model chi-squared test
**e(offset1)** offset for main equation
**e(stcurve)** **stcurve**
**e(opt)** type of optimization
**e(which)** **max** or **min**; whether optimizer is to perform maximization or minimization
**e(ml_method)** type of **ml** method
**e(user)** name of likelihood-evaluator program
**e(technique)** maximization technique
**e(properties)** **b V**
**e(predict)** program used to implement **predict**
**e(predict_sub)** **predict** subprogram
**e(footnote)** program used to implement the footnote display
**e(asbalanced)** factor variables **fvset** as **asbalanced**
**e(asobserved)** factor variables **fvset** as **asobserved**

Matrices
**e(b)** coefficient vector
**e(Cns)** constraints matrix
**e(ilog)** iteration log (up to 20 iterations)
**e(gradient)** gradient vector
**e(V)** variance-covariance matrix of the estimators
**e(V_modelbased)** model-based variance

Functions
**e(sample)** marks estimation sample
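
A short sketch of how these results might be inspected after estimation, assuming the **kva** dataset from the Examples section; only names listed above are used.

```stata
webuse kva, clear
streg load bearings, distribution(weibull)

* List everything stored in e()
ereturn list

* Pick out individual scalars and macros
display "subjects: " e(N_sub) "   failures: " e(N_fail)
display "model: `e(cmd)'   metric: `e(frm2)'"

* Inspect the coefficient vector
matrix list e(b)
```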