
# st: Multi-level discrete time survival analysis

From: 0 1 | To: statalist@hsphsun2.harvard.edu | Subject: st: Multi-level discrete time survival analysis | Date: Sat, 26 Nov 2011 15:26:30 -0500

```
I am trying to compare 50 treatment facilities on the likelihood of
their children “graduating” from the program within 3 years (36
months). I would also like to control for the “speed” at which their
children graduate, using length of stay (LOS) in discrete time
intervals of 0-5 mo, 6-11 mo, 12-23 mo, and 24-36 mo. (Children still
in treatment after 36 months would be censored.) So, basically: Which
programs discharge the most children within 3 years, and do it the
fastest? My thought is that I could use each facility's coefficient
(or some other output) to "rank" them.

Is a (multi-level) discrete time survival analysis the best approach
to address this question? If so, would a multi-level discrete-time model in
Stata yield a single coefficient for each facility that would reflect
that facility's "performance" related to the likelihood and speed of
graduating children by 36 months (controlling for age, etc.)?

Some more details about the data:

1. I have 3 years of data on about 50,000 children (about 330 per
facility, per year). Although I have LOS in days (from date of entry
to date of graduation), with these data I understand it’s better to
treat time in discrete intervals like the ones I listed.  This is
because ties are common: a lot of kids tend to leave at the end of the
month, or end of the year, etc., due in part to insurance rules. Plus,
many treatment strategies are built around these intervals, so they
have an important meaning.

2. My data are arranged so each child has one record for each discrete
time period that he was observed (i.e., person-period format).

3. My event variable is GRADUATE (1 = Yes, 0 = No). The child is
censored if his LOS exceeds 36 months, or if he is still in treatment
when data collection stops.

4. My time variables are four dummy variables (d1, d2, d3, d4) that
represent the LOS intervals (0-5 mo, 6-11 mo, etc.), coded 1 if the
child was observed in that interval and 0 otherwise.

5. I also have some covariates that I would like to control for:

STARTYEAR – Year child entered the program (each year is a cohort of children)
AGE – Age of child when he entered

###

Thank you for any insights. I'm new to Stata and new to survival
analysis, so bear with me.

HH

```
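The person-period layout described in points 2 to 4 of the post can be sketched in a few lines. This is only an illustration in Python, not the poster's Stata code: the function names, the tiny `children` list, and the facility labels are hypothetical, LOS is assumed to be in whole months, and a child censored partway through an interval is assumed to count as at risk for that whole interval.

```python
# Interval bounds in months, taken from the post: 0-5, 6-11, 12-23, 24-36.
INTERVALS = [(0, 5), (6, 11), (12, 23), (24, 36)]


def person_period_rows(child_id, facility, los_months, graduated):
    """Expand one child's stay into one row per interval at risk.

    los_months: whole months from entry to graduation or last observation.
    graduated:  True if the stay ended in graduation, False if censored.
    """
    rows = []
    for k, (start, end) in enumerate(INTERVALS, start=1):
        if los_months < start:
            break  # the child never reached this interval
        # The event is recorded only in the interval where graduation occurred;
        # graduation after month 36 never sets it, so the child is censored.
        event = int(graduated and los_months <= end)
        row = {"id": child_id, "facility": facility, "period": k, "GRADUATE": event}
        # d1..d4 flag which interval this row belongs to.
        for j in range(1, len(INTERVALS) + 1):
            row[f"d{j}"] = int(j == k)
        rows.append(row)
        if event:
            break  # no further rows once the child has graduated
    return rows


def interval_hazards(rows):
    """Life-table hazard per period: graduations divided by rows at risk."""
    at_risk, events = {}, {}
    for r in rows:
        k = r["period"]
        at_risk[k] = at_risk.get(k, 0) + 1
        events[k] = events.get(k, 0) + r["GRADUATE"]
    return {k: events[k] / at_risk[k] for k in sorted(at_risk)}


# Hypothetical children: (id, facility, LOS in months, graduated?)
children = [
    (1, "A", 4, True),    # graduated in interval 1
    (2, "A", 14, True),   # graduated in interval 3
    (3, "B", 8, False),   # censored during interval 2
    (4, "B", 40, False),  # still in treatment after 36 months
]
rows = [r for c in children for r in person_period_rows(*c)]
hazards = interval_hazards(rows)
print(hazards)
```

Rows in this shape are what a discrete-time logit or complementary log-log model would take as input, with d1 to d4 as the baseline hazard terms and a facility-level random intercept added for the multi-level version. Fitting that model is outside this sketch.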