{smcl}
{* 10oct2001}{...}
{hline}
help for {hi:streg2}
{hline}

{title:Estimate parametric survival models}

{p 8 16}{cmd:streg2}{space 3}[{it:varlist}]
[{cmd:if} {it:exp}] [{cmd:in} {it:range}]
[{cmd:,} {cmdab:d:ist(}{it:distname}{cmd:)} {cmd:nohr} {cmdab:ti:me}
{cmd:tr} {cmdab:l:evel(}{it:#}{cmd:)} {cmdab:r:obust}
{cmdab:cl:uster(}{it:varname}{cmd:)} {cmdab:sc:ore(}{it:newvar(s)}{cmd:)}
{cmdab:anc:illary(}{it:varlist}{cmd:)} {cmd:anc2(}{it:varlist}{cmd:)}
{cmdab:st:rata(}{it:varname}{cmd:)}
{cmdab:fr:ailty(}{cmdab:g:amma} | {cmdab:i:nvgaussian)}
{cmdab:sh:ared(}{it:varname}{cmd:)}
{cmdab:nosh:ow} {cmdab:const:raints(}{it:numlist}{cmd:)}
{cmd:cmd} {cmdab:nolo:g} {it:maximize_options} ]

{p}where {it:distname} is one of

{p 8 8}{c -(} {cmdab:e:xponential} | {cmdab:w:eibull} | {cmdab:gom:pertz} |
{cmdab:logn:ormal} | {cmdab:ln:ormal} | {cmdab:logl:ogistic} |
{cmdab:ll:ogistic} | {cmdab:gam:ma} {c )-}

{p}{cmd:lognormal} and {cmd:lnormal} are synonyms; {cmd:loglogistic} and
{cmd:llogistic} are synonyms.


{p}{cmd:streg2} is for use with survival-time data; see help {help st}.  You
must {cmd:stset} your data before using {cmd:streg2}; see help {help stset}.

{p}{cmd:by} {it:...} {cmd::} may be used with {cmd:streg2}; see help {help by}.

{p}{cmd:stcurve} may be used after {cmd:streg2}; see help {help stcurve}.

{p}{cmd:streg2} shares the features of all estimation commands; see help
{help est}.


{p}The syntax of {help predict} following {cmd:streg2} is

{p 8 16}{cmd:predict} [{it:type}] {it:newvarname} [{cmd:if} {it:exp}]
          [{cmd:in} {it:range}] [{cmd:,} {it:statistic}]

{p}where {it:statistic} is

{p 8 28}{cmdab:med:ian} {cmd:time} {space 3} predicted median survival time;
	the default{p_end}
{p 8 28}{cmdab:med:ian} {cmdab:lnt:ime} {space 1} predicted median
	ln(survival time){p_end}
{p 8 28}{cmd:mean time} {space 5} predicted mean survival time{p_end}
{p 8 28}{cmd:mean} {cmdab:lnt:ime} {space 3} predicted mean
	ln(survival time){p_end}
{p 8 28}{cmd:hazard} {space 8} predicted hazard{p_end}
{p 8 28}{cmd:hr} {space 12} predicted hazard ratio{p_end}
{p 8 28}{cmd:xb} {space 12} linear prediction{p_end}
{p 8 28}{cmd:stdp} {space 10} standard error of the linear prediction{p_end}
{p 8 28}{cmdab:s:urv} {space 10} predicted S(depvar) or S(depvar|t0){p_end}
{p 8 28}{cmdab:csn:ell} {space 8} (partial) Cox-Snell residuals{p_end}
{p 8 28}{cmdab:mg:ale} {space 9} (partial) martingale-like residuals{p_end}
{p 8 28}{cmdab:dev:iance} {space 6} deviance residuals{p_end}
{p 8 28}{cmdab:cs:urv} {space 9} predicted S(depvar|earliest t0 for
	subject){p_end}
{p 8 28}{cmdab:ccs:nell} {space 7} cumulative Cox-Snell residuals{p_end}
{p 8 28}{cmdab:cmg:ale} {space 8} cumulative martingale-like residuals{p_end}

{p}These statistics are available both in and out of sample; type
"{cmd:predict} {it:...} {cmd:if e(sample)} {it:...}" if wanted only for the
estimation sample.

{p}When no option is specified, the predicted median survival time is
calculated for all models.  The predicted hazard ratio option {cmd:hr} is only
available for the exponential, Weibull, and Gompertz models.  The
{cmd:mean time} and {cmd:mean lntime} options are not available for the
Gompertz model, and the {cmd:mean time} option is not available for the
generalized log-gamma model.  The {cmd:mean time} and {cmd:mean lntime}
options are not available if {cmd:frailty()} is specified.


{title:Description}

{p}{cmd:streg2} is a version of the Stata command {cmd:streg} that also fits
shared frailty models.  Its syntax is otherwise identical to that of
{cmd:streg}, so {cmd:streg2} may be used as a substitute for {cmd:streg}.  To
fit a shared frailty model, add the option
{cmd:shared(}{it:varname}{cmd:)}, where {it:varname} is the variable that
defines the groups over which frailties are shared.  {cmd:shared()} may be
used only in conjunction with {cmd:frailty()}.

{p}{cmd:streg2} performs maximum likelihood estimation of parametric regression
survival-time models.  Survival models currently supported are exponential,
Weibull, Gompertz, lognormal, log-logistic and generalized gamma.  Also see
help {help stcox} for estimation of proportional hazards models.

{p}{cmd:stcurve} is used after {cmd:streg2} to plot the cumulative hazard,
survival, and hazard functions at the mean value of the covariates or at
values specified by the {cmd:at()} options.


{title:Options for {cmd:streg2}}

{p 0 4}{cmd:dist(}{it:distname}{cmd:)} specifies the survival model to be
estimated.  If {cmd:dist(}{it:distname}{cmd:)} is not specified, {cmd:streg2}
uses the distribution from the most recent {cmd:streg2} estimation or, if
there is none, issues an error message.

{p 4 4}If {cmd:frailty()} is specified, both {cmd:dist()} and {cmd:frailty()}
carry over to each subsequent estimation in which neither option is specified.
However, if a subsequent estimation specifies {cmd:dist()} without
{cmd:frailty()}, a model without frailty is estimated.
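{p_end}

{p 4 4}A minimal sketch of this carry-over, assuming the data are already
{cmd:stset} and using the illustrative covariates {cmd:load} and
{cmd:bearings} from the Examples section.  By the rule above, the second
command should reuse the Weibull distribution and gamma frailty, while the
third should fit a Weibull model without frailty.{p_end}

{p 8 12}{inp:. streg2 load bearings, dist(weibull) frailty(gamma)}{p_end}
{p 8 12}{inp:. streg2 load}{p_end}
{p 8 12}{inp:. streg2 load, dist(weibull)}{p_end}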

{p 0 4}{cmd:nohr} specifies that coefficients rather than exponentiated
coefficients are to be displayed or, said differently, coefficients rather
than hazard ratios are displayed.  This option is valid only for models with a
proportional hazard ratio parameterization: exponential, Weibull, and
Gompertz.

{p 4 4}{cmd:nohr}, which can be specified when the model is estimated or when
redisplaying results, states that the underlying log relative hazard
coefficients are to be displayed.  This option affects only how results are
displayed, not how they are estimated.

{p 0 4}{cmd:time} specifies that the model is to be estimated in the
accelerated failure-time metric rather than in the log relative-hazard metric.
This option is only valid for the exponential and Weibull models since they
have both a hazard ratio and an accelerated failure-time parameterization.
For these two models, in the log relative-hazard metric, estimates of (B,s)
are produced and in the accelerated failure-time metric, estimates of
(-B*s,s) are produced.

{p 4 4}Regardless of metric, the likelihood function is the same and models
are equally appropriate viewed in either metric; it is just a matter of
changing interpretation.

{p 4 4}{cmd:time} must be specified when the model is estimated.
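{p_end}

{p 4 4}As a sketch, assuming {cmd:stset} data and the illustrative covariates
{cmd:load} and {cmd:bearings} from the Examples section, the same Weibull
model can be fit in each metric.  The first command displays log
relative-hazard coefficients; the second reports accelerated failure-time
coefficients.{p_end}

{p 8 12}{inp:. streg2 load bearings, dist(weibull) nohr}{p_end}
{p 8 12}{inp:. streg2 load bearings, dist(weibull) time}{p_end}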

{p 0 4}{cmd:tr} is appropriate only for the log-logistic, lognormal, and gamma
models, or for the exponential and Weibull models when estimated in the
accelerated failure-time metric (that is, with the {cmd:time} option).
{cmd:tr} specifies that exponentiated coefficients are to be displayed; these
are interpreted as time ratios.

{p 4 4}{cmd:tr} may be specified when the model is estimated or when results
are redisplayed.

{p 0 4}{cmd:level(}{it:#}{cmd:)} specifies the confidence level, in percent,
for confidence intervals.  The default is {cmd:level(95)} or as set by
{cmd:set level}; see help {help level}.

{p 0 4}{cmd:robust} specifies that the robust method of calculating the
variance-covariance matrix is to be used instead of the traditional
calculation.  If you specify {cmd:robust}, and if you have previously
{cmd:stset} an {cmd:id()} variable, the {cmd:robust} calculation will be
clustered on the {cmd:id()} variable.

{p 4 4}We especially recommend that you specify {cmd:robust} if you have
{cmd:stset} an {cmd:id()} variable because the assumption that justifies the
conventional variance estimate -- the independence of the observations --
is presumably false.

{p 0 4}{cmd:cluster(}{it:varname}{cmd:)} implies {cmd:robust} and specifies a
variable on which clustering is to be based.  This overrides the default
clustering, if any.
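{p_end}

{p 4 4}For illustration only, assuming a grouping variable named {cmd:batch}
(as in the Examples section) and the covariates {cmd:load} and
{cmd:bearings}, robust standard errors clustered on {cmd:batch} rather than
on the {cmd:id()} variable might be requested as follows.{p_end}

{p 8 12}{inp:. streg2 load bearings, dist(weibull) cluster(batch)}{p_end}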

{p 0 4}{cmd:score(}{it:newvar(s)}{cmd:)} creates {it:newvar(s)} containing the
contribution to the scores.  One new variable is specified in the case of an
exponential model, two variables are specified for Weibull, lognormal,
Gompertz, and log-logistic models, and three new variables are specified in
the case of gamma.

{p 4 4}The first new variable will contain d(ln L_j)/d(x_jb).

{p 4 4}The second and third new variables, if they exist, will contain the
derivatives of ln L_j with respect to the first and second ancillary
parameters, respectively.
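{p_end}

{p 4 4}For instance, a Weibull model takes two new variables.  In this sketch
the names {cmd:sc_xb} and {cmd:sc_anc} are hypothetical, as are the
covariates, which are borrowed from the Examples section.{p_end}

{p 8 12}{inp:. streg2 load bearings, dist(weibull) score(sc_xb sc_anc)}{p_end}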

{p 0 4}{cmd:ancillary(}{it:varlist}{cmd:)} specifies that the ancillary
parameter for the Weibull, lognormal, Gompertz, and log-logistic distributions
and the first ancillary parameter (sigma) of the generalized log-gamma
distribution are to be estimated as a linear combination of {it:varlist}.
This option is not available if {cmd:frailty()} is specified.

{p 0 4}{cmd:anc2(}{it:varlist}{cmd:)} specifies that the second ancillary
parameter (kappa) for the generalized log-gamma distribution is to be estimated
as a linear combination of {it:varlist}.  This option is not available if
{cmd:frailty()} is specified.
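{p_end}

{p 4 4}A hedged sketch of both options, assuming {cmd:stset} data and using
{cmd:load} and {cmd:bearings} purely as placeholder covariates.  The first
command models the Weibull ancillary parameter as a function of
{cmd:bearings}; the second does the same for both ancillary parameters of the
gamma model.{p_end}

{p 8 12}{inp:. streg2 load, dist(weibull) ancillary(bearings)}{p_end}
{p 8 12}{inp:. streg2 load, dist(gamma) ancillary(bearings) anc2(bearings)}{p_end}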

{p 0 4}{cmd:strata(}{it:varname}{cmd:)} specifies a stratification variable.
Observations with equal values of the variable are assumed to be in the same
stratum.  Stratified estimates (equal coefficients across strata but intercepts
and ancillary parameters unique to each stratum) are then obtained.  This
option is not available if {cmd:frailty()} is specified.
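{p_end}

{p 4 4}As an illustration, assuming a categorical variable named {cmd:batch}
(hypothetical here) defines the strata, a stratified Weibull model without
frailty could be requested as follows.{p_end}

{p 8 12}{inp:. streg2 load bearings, dist(weibull) strata(batch)}{p_end}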

{p 0 4}{cmd:frailty(gamma} | {cmd:invgaussian)} specifies the assumed
distribution of the observation-level frailty or heterogeneity.  The estimated
model will, in addition to the standard parameter estimates, produce an
estimate of the variance of the frailties and a likelihood-ratio test of the
null hypothesis that this variance is zero.  When this null hypothesis is
true, the model reduces to the model with {cmd:frailty()} not specified.

{p 0 4}{cmd:shared(}{it:varname}{cmd:)} specifies a shared frailty variable.
Observations with equal values of the variable are assumed to be in the same
group and thus assumed to share the same frailty.  {cmd:shared()} is not 
allowed with {cmd:dist(gamma)}, that is, when fitting a gamma regression 
model, but is allowed (of course) with {cmd:frailty(gamma)}, that is, when 
assuming a gamma distribution for the frailty.  If {cmd:frailty()} is 
specified without {cmd:shared()} then the frailties are assumed to be at 
the subject level rather than shared between subjects.
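{p_end}

{p 4 4}To make the distinction concrete, here is a sketch assuming
{cmd:stset} data, the illustrative covariates {cmd:load} and {cmd:bearings},
and a grouping variable {cmd:batch}, all taken from the Examples section.  The
first command assumes unshared inverse-Gaussian frailties; the second shares
them within {cmd:batch}.{p_end}

{p 8 12}{inp:. streg2 load bearings, dist(weibull) frailty(invgaussian)}{p_end}
{p 8 12}{inp:. streg2 load bearings, dist(weibull) frailty(invgaussian) shared(batch)}{p_end}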

{p 0 4}{cmd:noshow} prevents {cmd:streg2} from displaying the identities of the
key st variables above its output.  If this appeals to you, consider typing
"{cmd:stset, noshow}" to make {cmd:noshow} the default for all st commands;
see help {help stset}.

{p 0 4}{cmd:constraints(}{it:numlist}{cmd:)} specifies the linear constraints
to be applied during estimation.  Constraints are defined using the
{cmd:constraint} command and are numbered; see help {help constraint}.  The
default is to perform unconstrained estimation.
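{p_end}

{p 4 4}A minimal sketch, assuming the illustrative covariates {cmd:load} and
{cmd:bearings}: constraint 1 (an arbitrary number chosen here) forces the two
coefficients to be equal, and {cmd:nohr} displays the coefficients themselves
so that the constraint is visible in the output.{p_end}

{p 8 12}{inp:. constraint define 1 load = bearings}{p_end}
{p 8 12}{inp:. streg2 load bearings, dist(weibull) constraints(1) nohr}{p_end}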

{p 0 4}{cmd:cmd} displays the underlying command that {cmd:streg2} would
execute but does not estimate the model.

{p 0 4}{cmd:nolog} prevents {cmd:streg2} from showing the iteration log.

{p 0 4}{it:maximize_options} control the maximization process; see help
{help maximize}.  You should never have to specify them.


{title:Options for {cmd:stcurve}}

{p 0 4}{cmd:cumhaz} requests that the cumulative hazard function be plotted.

{p 0 4}{cmd:survival} requests that the survival function be plotted.

{p 0 4}{cmd:hazard} requests that the hazard function be plotted.

{p 0 4}{cmd:range(}{it:starttime endtime}{cmd:)} specifies the range of the
time axis to be plotted.  If this option is not specified, {cmd:stcurve} will
plot the desired curve on an interval extending from the earliest to the
latest time in the data.

{p 0 4}{cmd:at(}{it:varname}{cmd:=}{it:# ...}{cmd:)} requests that the
covariates specified by {it:varname} be set to the value of {it:#}.  By
default, {cmd:stcurve} evaluates the function by setting each covariate to its
mean value.  This option causes the function to be evaluated at the value of
the covariates listed in {cmd:at()} and at the mean of all unlisted covariates.

{p 0 4}{cmd:at1(}{it:varname}{cmd:=}{it:# ...}{cmd:)},
{cmd:at2(}{it:varname}{cmd:=}{it:# ...}{cmd:)}, ...,
{cmd:at10(}{it:varname}{cmd:=}{it:# ...}{cmd:)} specify that multiple curves
(up to 10) are to be plotted on the same graph.  {cmd:at1()}, {cmd:at2()}, ...,
{cmd:at10()} work like the {cmd:at()} option:  the option causes the function
to be evaluated at the value of the covariates specified and at the mean of
all unlisted covariates.  {cmd:at1()} specifies the values of the covariates
for the first curve, {cmd:at2()} specifies the values of the covariates for
the second curve, and so on.
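{p_end}

{p 4 4}For example, after fitting a model that includes a covariate named
{cmd:load} (the name and the values 10 and 20 are assumptions made for
illustration), two survival curves could be overlaid, with any other
covariates held at their means.{p_end}

{p 8 12}{inp:. stcurve, survival at1(load=10) at2(load=20)}{p_end}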

{p 0 4}{cmd:outfile(}{it:filename} [{cmd:,} {cmd:replace}]{cmd:)} saves in
{it:filename} the values used to plot the curve(s).

{p 0 4}{it:graph_options} are most of the options allowed with
{cmd:graph, twoway}; see help {help grtwoway}.


{title:Options for {help predict}}

{p 0 4}{cmd:median time}, the default, calculates the predicted median survival
time in analysis time units.  Note that this is the prediction from time 0
conditional on constant covariates.

{p 0 4}{cmd:median lntime} calculates the natural logarithm of what
{cmd:median time} produces.

{p 0 4}{cmd:mean time} calculates the predicted mean survival time in analysis
time units.  Note that this is the prediction from time 0 conditional on
constant covariates.  This option is not available for the Gompertz and gamma
regressions or when {cmd:frailty()} is used.

{p 0 4}{cmd:mean lntime} calculates the mean of the natural logarithm of time.
This option is not available for Gompertz regression or when {cmd:frailty()}
is used.

{p 0 4}{cmd:hazard} calculates the predicted hazard.

{p 0 4}{cmd:hr} calculates the predicted hazard ratio (excludes estimated
intercept).  This option is only available for the exponential, Weibull, and
Gompertz models.

{p 0 4}{cmd:xb} calculates the linear prediction.

{p 0 4}{cmd:stdp} calculates the standard error of the linear prediction.

{p 0 4}{cmd:surv} calculates the predicted S(t|t0).  If you did not specify
t0() when you estimated the model, t0=0 and thus {cmd:surv} calculates the
predicted survivor function at the time of failure or censoring, S(t).
Otherwise, it is the probability of surviving through t given survival through
t0.

{p 0 4}{cmd:csnell} calculates the (partial) Cox-Snell residual.  If you have
a single observation per subject, then {cmd:csnell} calculates the usual
Cox-Snell residual.  Otherwise, {cmd:csnell} calculates the additive
contribution of this observation to the subject's overall Cox-Snell residual.

{p 0 4}{cmd:mgale} calculates the (partial) martingale-like residual.  The
issues are the same as with {cmd:csnell} above.

{p 0 4}{cmd:deviance} calculates the deviance residual.  In the case of
multiple-record data, only one value per subject is calculated and it is
placed on the last record for the subject.

{p 0 4}{cmd:csurv} calculates the predicted S(t|earliest t0) for each subject
in multiple-record data.  This is based on calculating the conditional
survivor values S(t|t0) (see option {cmd:surv} above) and then multiplying
them together.

{p 0 4}{cmd:ccsnell} calculates the (cumulative) Cox-Snell residual in
multiple-record data.  This is based on calculating the partial Cox-Snell
residuals (see option {cmd:csnell} above) and then summing them.  Only one
value per subject is recorded -- the overall sum -- and it is placed on the
last record for the subject.

{p 0 4}{cmd:cmgale} calculates the (cumulative) martingale-like residual in
multiple-record data.  This is based on calculating the partial
martingale-like residuals (see option {cmd:mgale} above) and then summing them.
Only one value per subject is recorded -- the overall sum -- and it is placed
on the last record for the subject.
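{p_end}

{p 4 4}As a sketch of how these statistics are requested after a model has
been fit with {cmd:streg2}, the commands below store the default median
survival time, the predicted survivor function, and the partial and cumulative
Cox-Snell residuals.  The new variable names ({cmd:med_t}, {cmd:s},
{cmd:cs}, {cmd:ccs}) are hypothetical.{p_end}

{p 8 12}{inp:. predict med_t}{p_end}
{p 8 12}{inp:. predict s, surv}{p_end}
{p 8 12}{inp:. predict cs, csnell}{p_end}
{p 8 12}{inp:. predict ccs, ccsnell}{p_end}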


{title:Examples}

{p 8 12}{inp:. stset failtime, failure(died) id(serno)}{p_end}
{p 8 12}{inp:. streg2 load bearings, dist(weibull)}{p_end}
{p 8 12}{inp:. streg2 load bearings, dist(lognormal) robust}{p_end}
{p 8 12}{inp:. streg2 load bearings, dist(weibull) frailty(gamma) shared(batch)}{p_end}
{p 8 12}{inp:. streg2, dist(gamma)}

{p 8 12}{inp:. stcurve, cumhaz}{p_end}
{p 8 12}{inp:. stcurve, cumhaz at(drug=1)}{p_end}
{p 8 12}{inp:. stcurve, cumhaz at(drug=1, age=40)}{p_end}
{p 8 12}{inp:. stcurve, cumhaz at(drug=1, age=40) range(0,50)}


{title:Also see}

{p 1 14}Manual:  {hi:[U] 23 Estimation and post-estimation commands},{p_end}
{p 10 14}{hi:[U] 29 Overview of model estimation in Stata},{p_end}
{p 10 14}{hi:[R] st streg}{p_end}
{p 0 19}On-line:  help for {help constraint}, {help est}, {help postest};
{help st}, {help sts}, {help stset}, {help stcox}{p_end}