{smcl} {* 10oct2001}{...} {hline} help for {hi:streg2} {hline} {title:Estimate parametric survival models} {p 8 16}{cmd:streg2}{space 3}[{it:varlist}] [{cmd:if} {it:exp}] [{cmd:in} {it:range}] [{cmd:,} {cmdab:d:ist(}{it:distname}{cmd:)} {cmd:nohr} {cmdab:ti:me} {cmd:tr} {cmdab:l:evel(}{it:#}{cmd:)} {cmdab:r:obust} {cmdab:cl:uster(}{it:varname}{cmd:)} {cmdab:sc:ore(}{it:newvar(s)}{cmd:)} {cmdab:anc:illary(}{it:varlist}{cmd:)} {cmd:anc2(}{it:varlist}{cmd:)} {cmdab:st:rata(}{it:varname}{cmd:)} {cmdab:fr:ailty(}{cmdab:g:amma} | {cmdab:i:nvgaussian)} {cmdab:sh:ared(}{it:varname}{cmd:)} {cmdab:nosh:ow} {cmdab:const:raints(}{it:numlist}{cmd:)} {cmd:cmd} {cmdab:nolo:g} {it:maximize_options} ] {p}where {it:distname} is one of {p 8 8}{c -(} {cmdab:e:xponential} | {cmdab:w:eibull} | {cmdab:gom:pertz} | {cmdab:logn:ormal} | {cmdab:ln:ormal} | {cmdab:logl:ogistic} | {cmdab:ll:ogistic} | {cmdab:gam:ma} {c )-} {p}{cmd:lognormal} and {cmd:lnormal} are synonyms; {cmd:loglogistic} and {cmd:llogistic} are synonyms. {p}{cmd:streg2} is for use with survival-time data; see help {help st}. You must {cmd:stset} your data before using {cmd:streg2}; see help {help stset}. {p}{cmd:by} {it:...} {cmd::} may be used with {cmd:streg2}; see help {help by}. {p}{cmd:stcurve} may be used after {cmd:streg2}; see help {help stcurve}. {p}{cmd:streg2} shares the features of all estimation commands; see help {help est}. 
{p}The syntax of {help predict} following {cmd:streg2} is {p 8 16}{cmd:predict} [{it:type}] {it:newvarname} [{cmd:if} {it:exp}] [{cmd:in} {it:range}] [{cmd:,} {it:statistic}] {p}where {it:statistic} is {p 8 28}{cmdab:med:ian} {cmd:time} {space 3} predicted median survival time; the default{p_end} {p 8 28}{cmdab:med:ian} {cmdab:lnt:ime} {space 1} predicted median ln(survival time){p_end} {p 8 28}{cmd:mean time} {space 5} predicted mean survival time{p_end} {p 8 28}{cmd:mean} {cmdab:lnt:ime} {space 3} predicted mean ln(survival time){p_end} {p 8 28}{cmd:hazard} {space 8} predicted hazard{p_end} {p 8 28}{cmd:hr} {space 12} predicted hazard ratio{p_end} {p 8 28}{cmd:xb} {space 12} linear prediction{p_end} {p 8 28}{cmd:stdp} {space 10} standard error of the linear prediction{p_end} {p 8 28}{cmdab:s:urv} {space 10} predicted S(depvar) or S(depvar|t0){p_end} {p 8 28}{cmdab:csn:ell} {space 8} (partial) Cox-Snell residuals{p_end} {p 8 28}{cmdab:mg:ale} {space 9} (partial) martingale-like residuals{p_end} {p 8 28}{cmdab:dev:iance} {space 6} deviance residuals{p_end} {p 8 28}{cmdab:cs:urv} {space 9} predicted S(depvar|earliest t0 for subject){p_end} {p 8 28}{cmdab:ccs:nell} {space 7} cumulative Cox-Snell residuals{p_end} {p 8 28}{cmdab:cmg:ale} {space 8} cumulative martingale-like residuals {p}These statistics are available both in and out of sample; type "{cmd:predict} {it:...} {cmd:if e(sample)} {it:...}" if wanted only for the estimation sample. {p}When no option is specified, the predicted median survival time is calculated for all models. The predicted hazard ratio option {cmd:hr} is only available for the exponential, Weibull, and Gompertz models. The {cmd:mean time} and {cmd:mean lntime} options are not available for the Gompertz model, and the {cmd:mean time} option is not available for the generalized log-gamma model. The {cmd:mean time} and {cmd:mean lntime} options are not available if {cmd:frailty()} is specified. 
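{p}As an illustration of the syntax above, the following is a minimal sketch rather than a definitive recipe. It assumes the data are already {cmd:stset} and borrows the covariates {cmd:load} and {cmd:bearings} from the examples at the end of this help file; the new variable names {cmd:medtime}, {cmd:s_hat}, and {cmd:cs} are hypothetical.{p_end}
{p 8 12}{inp:. streg2 load bearings, dist(weibull)}{p_end}
{p 8 12}{inp:. predict medtime, median time}{p_end}
{p 8 12}{inp:. predict s_hat if e(sample), surv}{p_end}
{p 8 12}{inp:. predict cs, csnell}{p_end}
{p}Because {cmd:median time} is the default statistic, the first {cmd:predict} line is equivalent to typing {cmd:predict medtime} alone.{p_end}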
{title:Description} {p}{cmd:streg2} is a version of the Stata command {cmd:streg} which fits shared frailty models. The syntax is identical to that of {cmd:streg} and thus {cmd:streg2} may be used as a substitute for {cmd:streg}. Those wishing to fit shared frailty models may do so by adding the option {cmd:shared(}{it:varname}{cmd:)} where {it:varname} is the variable which defines the grouping over which frailties are shared. {cmd:shared()} may only be used in conjunction with {cmd:frailty()}. {p}{cmd:streg2} performs maximum likelihood estimation of parametric regression survival-time models. Survival models currently supported are exponential, Weibull, Gompertz, lognormal, log-logistic and generalized gamma. Also see help {help stcox} for estimation of proportional hazards models. {p}{cmd:stcurve} is used after {cmd:streg2} to plot the cumulative hazard, survival, and hazard functions at the mean value of the covariates or at values specified by the {cmd:at()} options. {title:Options for {cmd:streg2}} {p 0 4}{cmd:dist(}{it:distname}{cmd:)} specifies the survival model to be estimated. If {cmd:dist(}{it:distname}{cmd:)} is not specified, {cmd:streg2} will use the same distribution as the previous time {cmd:streg2} was used or, if there is no previous time, it will issue an error message. {p 4 4}If {cmd:frailty()} is specified, then {cmd:dist()} and {cmd:frailty()} will carry over to each subsequent estimation if neither is specified then. However, if in a subsequent estimation {cmd:dist()} is specified without {cmd:frailty()}, then a model without frailty is estimated. {p 0 4}{cmd:nohr} specifies that coefficients rather than exponentiated coefficients are to be displayed or, said differently, coefficients rather than hazard ratios are displayed. This option is valid only for models with a proportional hazard ratio parameterization: exponential, Weibull, and Gompertz. 
{p 4 4}{cmd:nohr}, which can be specified when the model is estimated or when redisplaying results, states that the underlying log relative hazard coefficients are to be displayed. This option affects only how results are displayed, not how they are estimated. {p 0 4}{cmd:time} specifies that the model is to be estimated in the accelerated failure-time metric rather than in the log relative-hazard metric. This option is only valid for the exponential and Weibull models since they have both a hazard ratio and an accelerated failure-time parameterization. For these two models, in the log relative-hazard metric, estimates of (B,s) are produced and in the accelerated failure-time metric, estimates of (-B*s,s) are produced. {p 4 4}Regardless of metric, the likelihood function is the same and models are equally appropriate viewed in either metric; it is just a matter of changing interpretation. {p 4 4}{cmd:time} must be specified when the model is estimated. {p 0 4}{cmd:tr} is appropriate only for the log-logistic, lognormal, and gamma models, or for the exponential and Weibull models when estimated in log expected time. {cmd:tr} specifies that exponentiated coefficients are to be displayed, which have the interpretation of time ratios. {p 4 4}{cmd:tr} may be specified when the model is estimated or when results are redisplayed. {p 0 4}{cmd:level(}{it:#}{cmd:)} specifies the confidence level, in percent, for confidence intervals. The default is {cmd:level(95)} or as set by {cmd:set level}; see help {help level}. {p 0 4}{cmd:robust} specifies that the robust method of calculating the variance-covariance matrix is to be used instead of the traditional calculation. If you specify {cmd:robust}, and if you have previously {cmd:stset} an {cmd:id()} variable, the {cmd:robust} calculation will be clustered on the {cmd:id()} variable. 
{p 4 4}We especially recommend that you specify {cmd:robust} if you have {cmd:stset} an {cmd:id()} variable because the assumption that justifies the conventional variance estimate -- the independence of the observations -- is presumably false. {p 0 4}{cmd:cluster(}{it:varname}{cmd:)} implies {cmd:robust} and specifies a variable on which clustering is to be based. This overrides the default clustering, if any. {p 0 4}{cmd:score(}{it:newvar(s)}{cmd:)} creates {it:newvar(s)} containing the contribution to the scores. One new variable is specified in the case of an exponential model, two variables are specified for Weibull, lognormal, Gompertz, and log-logistic models, and three new variables are specified in the case of gamma. {p 4 4}The first new variable will contain d(ln L_j)/d(x_jb). {p 4 4}The second and third new variables, if they exist, will contain d(ln L_j) with respect to the second and third ancillary parameters. {p 0 4}{cmd:ancillary(}{it:varlist}{cmd:)} specifies that the ancillary parameter for the Weibull, lognormal, Gompertz, and log-logistic distributions and the first ancillary parameter (sigma) of the generalized log-gamma distribution are to be estimated as a linear combination of {it:varlist}. This option is not available if {cmd:frailty()} is specified. {p 0 4}{cmd:anc2(}{it:varlist}{cmd:)} specifies that the second ancillary parameter (kappa) for the generalized log-gamma distribution is to be estimated as a linear combination of {it:varlist}. This option is not available if {cmd:frailty()} is specified. {p 0 4}{cmd:strata(}{it:varname}{cmd:)} specifies a stratification variable. Observations with equal values of the variable are assumed to be in the same stratum. Stratified estimates (equal coefficients across strata but intercepts and ancillary parameters unique to each stratum) are then estimated. This option is not available if {cmd:frailty()} is specified. 
{p 0 4}{cmd:frailty(gamma} | {cmd:invgaussian)} specifies the assumed distribution of the observation level frailty or heterogeneity. The estimated model will, in addition to the standard parameter estimates, produce an estimate of the variance of the frailties and a likelihood-ratio test of the null hypothesis that this variance is zero. When this null hypothesis is true, the model reduces to the model with {cmd:frailty()} not specified. {p 0 4}{cmd:shared(}{it:varname}{cmd:)} specifies a shared frailty variable. Observations with equal values of the variable are assumed to be in the same group and thus assumed to share the same frailty. {cmd:shared()} is not allowed with {cmd:dist(gamma)}, that is, when fitting a gamma regression model, but is allowed (of course) with {cmd:frailty(gamma)}, that is, when assuming a gamma distribution for the frailty. If {cmd:frailty()} is specified without {cmd:shared()} then the frailties are assumed to be at the subject level rather than shared between subjects. {p 0 4}{cmd:noshow} prevents {cmd:streg2} from displaying the identities of the key st variables above its output. If this appeals to you, consider typing "{cmd:stset, noshow}" to make {cmd:noshow} the default for all st commands; see help {help stset}. {p 0 4}{cmd:constraints(}{it:numlist}{cmd:)} specifies the linear constraints to be applied during estimation. Constraints are defined using the {cmd:constraint} command and are numbered; see help {help constraint}. The default is to perform unconstrained estimation. {p 0 4}{cmd:cmd} displays the underlying command that {cmd:streg2} would execute but it does not estimate the model. {p 0 4}{cmd:nolog} prevents {cmd:streg2} from showing the iteration log. {p 0 4}{it:maximize_options} control the maximization process; see help {help maximize}. You should never have to specify them. {title:Options for {cmd:stcurve}} {p 0 4}{cmd:cumhaz} requests that the cumulative hazard function be plotted. 
{p 0 4}{cmd:survival} requests that the survival function be plotted. {p 0 4}{cmd:hazard} requests that the hazard function be plotted. {p 0 4}{cmd:range(}{it:starttime endtime}{cmd:)} specifies the range of the time axis to be plotted. If this option is not specified, {cmd:stcurve} will plot the desired curve on an interval expanding from the earliest to the latest time in the data. {p 0 4}{cmd:at(}{it:varname}{cmd:=}{it:# ...}{cmd:)} requests that the covariates specified by {it:varname} be set to the value of {it:#}. By default {cmd:stcurve} evaluates the function by setting each covariate to its mean value. This option causes the function to be evaluated at the value of the covariates listed in {cmd:at()} and at the mean of all non-listed covariates. {p 0 4}{cmd:at1(}{it:varname}{cmd:=}{it:# ...}{cmd:)}, {cmd:at2(}{it:varname}{cmd:=}{it:# ...}{cmd:)}, ..., {cmd:at10(}{it:varname}{cmd:=}{it:# ...}{cmd:)} specify that multiple curves (up to 10) are to be plotted on the same graph. {cmd:at1()}, {cmd:at2()}, ..., {cmd:at10()} work like the {cmd:at()} option: the option causes the function to be evaluated at the value of the covariates specified and at the mean of all unlisted covariates. {cmd:at1()} specifies the values of the covariates for the first curve, {cmd:at2()} specifies the values of the covariates for the second curve, and so on. {p 0 4}{cmd:outfile(}{it:filename} [{cmd:,} {cmd:replace}]{cmd:)} saves in {it:filename} the values used to plot the curve(s). {p 0 4}{it:graph_options} are most of the options allowed with {cmd:graph, twoway}; see help {help grtwoway}. {title:Options for {help predict}} {p 0 4}{cmd:median time}, the default, calculates the predicted median survival time in analysis time units. Note that this is the prediction from time 0 conditional on constant covariates. {p 0 4}{cmd:median lntime} calculates the natural logarithm of what {cmd:median time} produces. 
{p 0 4}{cmd:mean time} calculates the predicted mean survival time in analysis time units. Note that this is the prediction from time 0 conditional on constant covariates. This option is not available for Gompertz or gamma regression, nor when {cmd:frailty()} is used. {p 0 4}{cmd:mean lntime} calculates the mean of the natural logarithm of time. This option is not available for Gompertz regression, nor when {cmd:frailty()} is used. {p 0 4}{cmd:hazard} calculates the predicted hazard. {p 0 4}{cmd:hr} calculates the predicted hazard ratio (excludes estimated intercept). This option is only available for the exponential, Weibull, and Gompertz models. {p 0 4}{cmd:xb} calculates the linear prediction. {p 0 4}{cmd:stdp} calculates the standard error of the linear prediction. {p 0 4}{cmd:surv} calculates the predicted S(t|t0). If you did not specify t0() when you estimated the model, t0=0 and thus {cmd:surv} calculates the predicted survivor function at the time of failure or censoring, S(t). Otherwise, it is the probability of surviving through t given survival through t0. {p 0 4}{cmd:csnell} calculates the (partial) Cox-Snell residual. If you have single observations per subject, then {cmd:csnell} calculates the usual Cox-Snell residual. Otherwise, {cmd:csnell} calculates the additive contribution of this observation to the subject's overall Cox-Snell residual. {p 0 4}{cmd:mgale} calculates the (partial) martingale-like residual. The issues are the same as with {cmd:csnell} above. {p 0 4}{cmd:deviance} calculates the deviance residual. In the case of multiple-record data, only one value per subject is calculated and it is placed on the last record for the subject. {p 0 4}{cmd:csurv} calculates the predicted S(t|earliest t0) for each subject in multiple-record data. This is based on calculating the conditional survivor values S(t|t0) (see option {cmd:surv} above) and then multiplying them together. 
{p 0 4}{cmd:ccsnell} calculates the (cumulative) Cox-Snell residual in multiple-record data. This is based on calculating the partial Cox-Snell residuals (see option {cmd:csnell} above) and then summing them. Only one value per subject is recorded -- the overall sum -- and it is placed on the last record for the subject. {p 0 4}{cmd:cmgale} calculates the (cumulative) martingale-like residual in multiple-record data. This is based on calculating the partial martingale-like residuals (see option {cmd:mgale} above) and then summing them. Only one value per subject is recorded -- the overall sum -- and it is placed on the last record for the subject. {title:Examples} {p 8 12}{inp:. stset failtime, failure(died) id(serno)}{p_end} {p 8 12}{inp:. streg2 load bearings, dist(weibull)}{p_end} {p 8 12}{inp:. streg2 load bearings, dist(lognormal) robust}{p_end} {p 8 12}{inp:. streg2 load bearings, dist(weibull) frailty(gamma) shared(batch)}{p_end} {p 8 12}{inp:. streg2, dist(gamma)} {p 8 12}{inp:. stcurve, cumhaz}{p_end} {p 8 12}{inp:. stcurve, cumhaz at(drug=1)}{p_end} {p 8 12}{inp:. stcurve, cumhaz at(drug=1, age=40)}{p_end} {p 8 12}{inp:. stcurve, cumhaz at(drug=1, age=40) range(0,50)} {title:Also see} {p 1 14}Manual: {hi:[U] 23 Estimation and post-estimation commands},{p_end} {p 10 14}{hi:[U] 29 Overview of model estimation in Stata},{p_end} {hi:[R] st streg} {p 0 19}On-line: help for {help constraint}, {help est}, {help postest}; {help st}, {help sts}, {help stset}, {help stcox}{p_end}