The input data for the survivalanalysis features are duration records: each observation records a span of time over which the subject was observed, along with an outcome at the end of the period. There can be one record per subject or, if covariates vary over time, multiple records.
You can obtain simple descriptions:
. webuse cancer (Patient Survival in Drug Trial) . stset study died failure event: died != 0 & died < . obs. time interval: (0, studytime] exit on or before: failure
48 total observations 0 exclusions 
48 observations remaining, representing 31 failures in singlerecord/singlefailure data 744 total analysis time at risk and under observation at risk from t = 0 earliest observed entry t = 0 last observed exit t = 39 
per subject  
Category total mean min median max  
no. of subjects 48  
no. of records 48 1 1 1 1  
(first) entry time 0 0 0 0  
(final) exit time 15.5 1 12.5 39  
subjects with gap 0  
time on gap if gap 0  
time at risk 744 15.5 1 12.5 39  
failures 31 .6458333 0 1 1  


dose  time at risk rate subjects 25% 50% 75%  
Control  180 .1055556 20 4 8 12  
5 mg  209 .0287081 14 13 22 23  
10 mg  355 .0169014 14 25 33 .  
total  744 .0416667 48 8 17 33 
You can also compare the survivor functions
. sts list, by(dose) compare failure _d: died analysis time _t: studytime


dose Control 5 mg 10 mg  
time 1 0.9000 1.0000 1.0000  
5 0.6000 1.0000 1.0000  
9 0.4500 0.8512 0.9286  
13 0.2250 0.7448 0.8571  
17 0.1125 0.6207 0.8571  
21 0.1125 0.6207 0.8571  
25 . 0.2069 0.6857  
29 . 0.2069 0.5878  
33 . . 0.4408  
37 . . 0.4408  
41 . . .  
or you can review the complete life table:
. sts list, by(dose) failure _d: died analysis time _t: studytime
Beg. Net Survivor Std. Time Total Fail Lost Function Error [95% Conf. Int.] 
Control 1 20 2 0 0.9000 0.0671 0.6560 0.9740 2 18 1 0 0.8500 0.0798 0.6038 0.9490 3 17 1 0 0.8000 0.0894 0.5511 0.9198 4 16 2 0 0.7000 0.1025 0.4505 0.8525 (output omitted) 23 1 1 0 0.0000 . . . 5 mg 6 14 1 1 0.9286 0.0688 0.5908 0.9896 7 12 1 0 0.8512 0.0973 0.5234 0.9607 9 11 0 1 0.8512 0.0973 0.5234 0.9607 10 10 0 1 0.8512 0.0973 0.5234 0.9607 (output omitted) 32 1 0 1 0.2069 0.1769 0.0104 0.5804 10 mg 6 14 1 0 0.9286 0.0688 0.5908 0.9896 10 13 1 0 0.8571 0.0935 0.5394 0.9622 17 12 0 1 0.8571 0.0935 0.5394 0.9622 19 11 0 1 0.8571 0.0935 0.5394 0.9622 (output omitted) 39 1 0 1 0.4408 0.1673 0.1312 0.7187 
Just as easily, you can obtain a graph
. sts graph, by(dose)
or test the equality of the survivor functions:
. sts test dose failure _d: died analysis time _t: studytime
Logrank test for equality of survivor functions 
Events Events  
dose  observed expected  
Control  19 7.25  
5 mg  6 8.20  
10 mg  6 15.56  
Total  31 31.00 
We used the logrank test, but we could have specified the Wilcoxon–Breslow–Gehan test, the Tarone–Ware test, the Peto–Peto–Prentice test, or the Fleming–Harrington test.
We could also perform stratified versions of these tests to control for an external covariate:
. generate agecat = 1 . replace agecat = 2 if age > 55 (25 real changes made) . replace agecat = 3 if age > 60 (11 real changes made) . tabulate agecat
agecat  Freq. Percent Cum.  
1  23 47.92 47.92  
2  14 29.17 77.08  
3  11 22.92 100.00  
Total  48 100.00 
Stratified logrank test for equality of survivor functions 
Events Events  
dose  observed expected(*)  
Control  19 7.37  
5 mg  6 9.67  
10 mg  6 13.95  
Total  31 31.00 
Stata can fit Cox proportional hazards, exponential, Weibull, Gompertz, lognormal, loglogistic, and gamma models. In the case of the Cox proportional hazards model,
The same is true of the parametric models. For exponential and Weibull models, estimates are available in either the acceleratedtime or hazard metric.
Here we will focus on the Cox proportional hazards model using a model fitted on our dose–age data that we described above:
. stcox age i.dose failure _d: died analysis time _t: studytime Iteration 0: log likelihood = 99.911448 Iteration 1: log likelihood = 82.331523 Iteration 2: log likelihood = 81.676487 Iteration 3: log likelihood = 81.652584 Iteration 4: log likelihood = 81.652567 Refining estimates: Iteration 0: log likelihood = 81.652567 Cox regression  Breslow method for ties No. of subjects = 48 Number of obs = 48 No. of failures = 31 Time at risk = 744 LR chi2(3) = 36.52 Log likelihood = 81.652567 Prob > chi2 = 0.0000
_t  Haz. Ratio Std. Err. z P>z [95% Conf. Interval]  
age  1.118334 .0409074 3.06 0.002 1.040963 1.201455  
dose  
5 mg  .1805839 .0892742 3.46 0.001 .0685292 .4758636  
10 mg  .0520066 .034103 4.51 0.000 .0143843 .1880305  
By default, stcox uses Breslow’s method for handling ties and presents results as hazard ratios—that is, exponentiated coefficients—but we can see the underlying coefficients if we wish:
. stcox, nohr Cox regression  Breslow method for ties No. of subjects = 48 Number of obs = 48 No. of failures = 31 Time at risk = 744 LR chi2(3) = 36.52 Log likelihood = 81.652567 Prob > chi2 = 0.0000
_t  Coef. Std. Err. z P>z [95% Conf. Interval]  
age  .11184 .0365789 3.06 0.002 .0401467 .1835333  
dose  
5 mg  1.71156 .4943639 3.46 0.001 2.680495 .7426241  
10 mg  2.956384 .6557432 4.51 0.000 4.241617 1.671151  
stcox can also handle ties using Efron’s method, the exact partiallikelihood method, or the exact marginallikelihood method.
We can as easily fit the model with robust estimates of variance:
. stcox age i.dose, vce(robust) failure _d: died analysis time _t: studytime Iteration 0: log pseudolikelihood = 99.911448 Iteration 1: log pseudolikelihood = 82.331523 Iteration 2: log pseudolikelihood = 81.676487 Iteration 3: log pseudolikelihood = 81.652584 Iteration 4: log pseudolikelihood = 81.652567 Refining estimates: Iteration 0: log pseudolikelihood = 81.652567 Cox regression  Breslow method for ties No. of subjects = 48 Number of obs = 48 No. of failures = 31 Time at risk = 744 Wald chi2(3) = 32.39 Log pseudolikelihood = 81.652567 Prob > chi2 = 0.0000
Robust  
_t  Haz. Ratio Std. Err. z P>z [95% Conf. Interval]  
age  1.118334 .0327643 3.82 0.000 1.055926 1.18443  
dose  
5 mg  .1805839 .0773571 4.00 0.000 .0779917 .4181288  
10 mg  .0520066 .0349232 4.40 0.000 .0139465 .1939333  
By using predict after stcox, we can obtain the following:
The data used above have censored observations but no timevarying covariates and no left truncation. The failure event—death—occurs only once. Had the data included timevarying covariates, left truncation, or recurring failure events, however, nothing would have changed in terms of what we type, and importantly, all the same features are available postestimation regardless of the characteristics of the data.
We can also fit Cox (as well as parametric) models with random effects.
Known as
. webuse catheter, clear (Kidney data, McGilchrist and Aisbett, Biometrics, 1991) . stset time infect failure event: infect != 0 & infect < . obs. time interval: (0, time] exit on or before: failure
76 total observations 0 exclusions 
76 observations remaining, representing 58 failures in singlerecord/singlefailure data 7,424 total analysis time at risk and under observation at risk from t = 0 earliest observed entry t = 0 last observed exit t = 562 
_t  Haz. Ratio Std. Err. z P>z [95% Conf. Interval]  
age  1.006202 .0120965 0.51 0.607 .9827701 1.030192  
female  .2068678 .095708 3.41 0.001 .0835376 .5122756  
theta  .4754497 .2673107  