Re: st: estimating cumulative hazard

From   Steve Samuels
Subject   Re: st: estimating cumulative hazard
Date   Thu, 6 Jun 2013 13:28:09 -0400

I'll just add: a little thought experiment shows that Solution 1 can't be
correct. With a constant interval hazard of 10%, the summed hazards reach
100% at the 10th interval and exceed 100% in every interval after that,
which is impossible for a probability.
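
The thought experiment can be checked numerically. Below is a minimal
sketch in Python (not Stata; the function names are made up for
illustration), assuming a constant hazard of 0.10 in every interval. It
compares the summed hazards of Solution 1 with 1 - S_T from Solution 2.

```python
def summed_hazards(hazards):
    """Solution 1: add up the interval hazards h_1 + ... + h_T."""
    return sum(hazards)


def one_minus_survival(hazards):
    """Solution 2: 1 - S_T, where S_T is the product of (1 - h_t)."""
    survival = 1.0
    for h in hazards:
        survival *= (1.0 - h)
    return 1.0 - survival


if __name__ == "__main__":
    # Constant 10% hazard, observed for T intervals (assumed values).
    for T in (5, 10, 11, 20):
        hazards = [0.10] * T
        print(T,
              round(summed_hazards(hazards), 4),
              round(one_minus_survival(hazards), 4))
```

At T = 11 the summed hazards come to about 1.1, while 1 - S_T is about
0.69 and can never reach 1 as long as every hazard is below 1.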



Solution 2 is the correct approach. Solution 1 instead estimates the
cumulative hazard, which approximates -log(S_T) when the hazards are
small; it is not a probability. See Stephen Jenkins's Survival Analysis
book at:
with general material at:
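
To see how these quantities relate, here is a small Python sketch
(illustrative only; the hazard values are invented and do not come from
any fitted model, and the function name is hypothetical). It computes the
summed hazards, -log(S_T), and 1 - S_T for one respondent. For hazards
strictly between 0 and 1, 1 - S_T <= sum of h_t <= -log(S_T), and the
three are close only when the hazards are small.

```python
import math


def compare(hazards):
    """Return the three candidate quantities for one respondent."""
    # S_T = (1 - h_1) * (1 - h_2) * ... * (1 - h_T)
    s_T = math.prod(1.0 - h for h in hazards)
    return {
        "sum_h": sum(hazards),        # Solution 1
        "neg_log_S": -math.log(s_T),  # cumulative hazard as -log(S_T)
        "one_minus_S": 1.0 - s_T,     # Solution 2
    }


# Made-up hazard sequences: one with small hazards, one with large ones.
small = compare([0.01, 0.02, 0.015, 0.01])
large = compare([0.20, 0.35, 0.40, 0.30])

if __name__ == "__main__":
    print(small)
    print(large)
```

With the small hazards all three values are near 0.05; with the large
hazards the sum exceeds 1 while 1 - S_T stays below 1.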


On Jun 5, 2013, at 6:54 PM, Matt Aronson wrote:

Dear Statalisters:

I have longitudinal education data on which I estimated a discrete time
survival model, with the event of interest being completion of a degree.
Based on those results, I want to estimate for each respondent the
probability that s/he *ever completed the degree* during the time period for
which s/he was observed. (I know I could just use a logistic regression
model for "ever completed"; my goal is in fact to compare with that.)

I have two different ideas of how to use the survival model results, and
I'd like to know which one (or neither) of these is right. My problem is
with the conceptual rather than the software aspects. Both of my approaches
start off by using the model results to calculate each respondent's hazard
values at each point in time, h_t.  I don't have a problem with that.

Here are my two approaches:

1)  Sum up each respondent's predicted hazard values over all of her/his T
periods of observation,

Prob(ever completed degree) = h_1 + h_2 + ... + h_T
 where h_t is the model-estimated Prob(completed at time t, given not
 completed by time t-1).

2) Take the complement of the respondent's estimated survival probability
up through the end of the Tth period of observation,

Prob(ever completed degree) = 1 - S_T
 where S_T = Prob(did not complete degree by time T) = (1-h_1) *
 (1-h_2) * ... * (1-h_T)

Which one of these is right (if either), and why?

Thanks very much for whatever you can offer!
Matt Aronson