capture log close set more 1 log using stb, replace di "stb started on $S_DATE at $S_TIME" ************************************************** * STB.DO/LOG: file illustrating -pgmhaz- in action ************************************************** discard clear /* Reset matsize to handle big model */ set matsize 350 /* CANCER.DTA: data set used in [5s] -cox- examples, vol 2, p271; also in 5.0 Ref Manual P-Z, p. 257 */ use c:\stata\cancer ge id = _n /* recode drug vble so like Manual example */ replace drug = 0 if drug == 1 replace drug = 1 if drug > 1 su list id studytim died drug age in 1/4 /* Expand data set so that there's one data row per person per period at risk and create some of the variables required by -pgmhaz- */ expand studytim sort id quietly by id: ge seqvar = _n quietly by id: ge dead = died & _n==_N list id studytim seqvar died dead age if id <= 4 * Use Cox model estimates as the main reference point for what follows * [The following -stcox- replicates the -cox- estimates given in the * version 4 Reference Manual [5s] cox, Vol 2, p.272] stset seqvar dead, id(id) stcox drug age, nohr baseh(coxbaseh) * baseline hazard shape (see also Version 4.0 Ref Manual Vol 2, p.279) gr coxbaseh studytim, xlab ylab sort seqvar list seqvar coxbaseh if coxbaseh~=. * Note, also for reference, estimates from a continuous time Weibull * model [compare the coefficient on log(duration) in the discrete time * "Weibull" model later, with 1-p derived from the cts time Weibull] stweib drug age, nohr * some variables to summarise duration dependence (see below) ge logd = ln(seqvar) ge d = seqvar ge d_2 = d*d ge d_3 = d_2*d ge d_4 = d_3*d * create 39 dummy vbles, one for each spell month at risk quietly for 1-39, ltype(numeric): ge byte d@ = seqvar == @ compress su /* Caution ------- Before trying to estimate a fully non-parametric duration dependence specification, you must check that there is an "event" (dead = 1) for every value of seqvar in the data. The easiest way is with the following crosstab */ tab seqvar dead /* If there are no events at all during some duration intervals (values of seqvar for which there are no obs with dead==1: see tab), then the person-month observations corresponding to these intervals should be excluded -- see below for an example. [If you don't want to throw obs away, then use a different duration specification, e.g. piece-wise constant (coarser grouping on the duration axis) or a polynomial in seqvar, or some other parametric form.] */ /* Now look at -pgmhaz- in action ... (Compare the coefficients with those from the cox model estimated above, and compare the baseline hazard estimates with the cox model ones shown in the Figure in Version 4.0 Ref Manual Vol 2, p279, or the graph & list generated earlier.) Baseline hazard specifications considered are: (i) "Weibull"-type and polynomial (ii) piecewise constant (iii) non-parametric */ /* "Weibull"-type and polynomial duration dependence specifications ... */ di "$S_DATE at $S_TIME" pgmhaz logd drug age, id(id) s(seqvar) d(dead) di "$S_DATE at $S_TIME" pgmhaz, eform pgmhaz d d_2 d_3 d_4 drug age, id(id) s(seqvar) d(dead) /* piece-wise constant baseline */ ge dur1 = d1+d2+d3+d4+d5+d6 ge dur2 = d7+d8+d9+d10+d11+d12 ge dur3 = d13+d14+d15+d16+d17+d18 ge dur4 = d19+d20+d21+d22+d23+d24 ge dur5 = d25+d26+d27+d28+d29+d30 ge dur6 = d31+d32+d33+d34+d35+d36+d37+d38+d39 di "$S_DATE at $S_TIME" pgmhaz dur2 dur3 dur4 dur5 dur6 drug age, /* */ i(id) s(seqvar) d(dead) trace di "$S_DATE at $S_TIME" pgmhaz dur1 dur2 dur3 dur4 dur5 dur6 drug age, /* */ i(id) s(seqvar) d(dead) nocons trace di "$S_DATE at $S_TIME" * Check -if- option works: first select on fixed covariate tab seqvar dead if age > 50 pgmhaz logd drug age if age > 50, id(id) s(seqvar) d(dead) preserve keep if age > 50 pgmhaz logd drug age, id(id) s(seqvar) d(dead) restore * Now for the non-parametric baseline estimates * [with checks on the -if- option, and equivalence of cons/nocons * specifications] di "$S_DATE at $S_TIME" pgmhaz d1-d8 d10-d13 d15-d17 d22-d25 d28 d33 drug age /* */ if (seqvar>=1 & seqvar<=8) | (seqvar>=10 & seqvar<=13) /* */ | (seqvar>=15 & seqvar<=17) | (seqvar>=22 & seqvar<=25) /* */ | seqvar==28 | seqvar==33 , /* */ i(id) s(seqvar) d(dead) nocons di "$S_DATE at $S_TIME" pgmhaz d2-d8 d10-d13 d15-d17 d22-d25 d28 d33 drug age /* */ if (seqvar>=1 & seqvar<=8) | (seqvar>=10 & seqvar<=13) /* */ | (seqvar>=15 & seqvar<=17) | (seqvar>=22 & seqvar<=25) /* */ | seqvar==28 | seqvar==33 , /* */ i(id) s(seqvar) d(dead) di "$S_DATE at $S_TIME" keep if (seqvar>=1 & seqvar<=8) | (seqvar>=10 & seqvar<=13) /* */ | (seqvar>=15 & seqvar<=17) | (seqvar>=22 & seqvar<=25) /* */ | seqvar==28 | seqvar==33 pgmhaz d1-d8 d10-d13 d15-d17 d22-d25 d28 d33 drug age , /* */ i(id) s(seqvar) d(dead) nocons di "$S_DATE at $S_TIME" pgmhaz d2-d8 d10-d13 d15-d17 d22-d25 d28 d33 drug age, /* */ i(id) s(seqvar) d(dead) di "$S_DATE at $S_TIME" di "stb ended on $S_DATE at $S_TIME" log close