


Re: st: Discrete time hazard models using cloglog and svy

From   Steve Samuels
Subject   Re: st: Discrete time hazard models using cloglog and svy
Date   Tue, 25 Jun 2013 11:36:13 -0400

Oops.  Those options with the missing round brackets are:
vce(cluster personid)
vce(bootstrap, cluster(personid))
vce(jackknife, cluster(personid))

Both the bootstrap and jackknife have options other than cluster().


I should expand this a bit:  -cloglog- in the non-survey setting also does not
require an id() option, even though individuals appear in the denominator of
every time period through the end of observation. This is directly analogous to
the setup of a continuous-time Cox model, where individuals appear in
every risk set up to the end of observation. The likelihood-based tests and
standard errors are based on the assumption that, conditional on covariates and
on time interval, the events in that interval are independent.
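For concreteness, here is a minimal sketch of the non-survey fit described above. The variable names are hypothetical, not taken from Angelo's data: event is a 0/1 indicator for failure in the interval, period indexes the time interval, and x1 and x2 stand in for covariates. Note that no person identifier appears anywhere in the command.

```stata
* Person-period data: one row per person per interval at risk.
* event, period, x1 and x2 are hypothetical variable names.

* Discrete-time proportional hazards model with a separate baseline
* term for each interval; no id() option is needed.
cloglog event i.period x1 x2

* Same model, reporting exponentiated coefficients.
cloglog event i.period x1 x2, eform
```

With the complementary log-log link, the exponentiated coefficients can be read as hazard ratios for the underlying continuous-time proportional hazards model.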

If individuals can experience multiple failures, then Angelo should use vce(cluster personid), vce(jackknife, cluster(personid) or vce(bootstrap, cluster(personid). 
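As a sketch of what the three alternatives look like in a full command, with the brackets balanced: personid is the identifier named in this thread, while event, period, x1 and x2 are hypothetical variable names, and the reps() and seed() values are arbitrary choices for illustration only.

```stata
* Cluster-robust (sandwich) standard errors, clustering on the person.
cloglog event i.period x1 x2, vce(cluster personid)

* Jackknife, deleting one person (all of that person's rows) at a time.
cloglog event i.period x1 x2, vce(jackknife, cluster(personid))

* Bootstrap, resampling whole persons rather than single rows.
* reps() and seed() are illustrative values, not recommendations.
cloglog event i.period x1 x2, vce(bootstrap, cluster(personid) reps(500) seed(12345))
```

All three treat the person, not the person-period row, as the independent unit, which is the point when one person can contribute more than one event.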


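A survey analysis estimates standard errors and CIs from the variation between
PSUs. Provided each individual's rows all fall within a single PSU, the
dependence among them is already accounted for at the PSU level. As a result,
you don't need to tell -svyset- or the -svy- prefix command anything about the
multiple appearances of individuals.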
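A minimal sketch of the survey version, assuming hypothetical variable names for the design (psu, wt, stratum) and for the model (event, period, x1, x2); Angelo's actual names will differ. The person identifier is not mentioned in either command.

```stata
* Declare the design once: primary sampling unit, sampling weight, strata.
* psu, wt and stratum are hypothetical variable names.
svyset psu [pweight = wt], strata(stratum)

* Fit the discrete-time hazard model on the person-period data.
* Standard errors come from between-PSU variation within strata.
svy: cloglog event i.period x1 x2
```

The person-period rows simply inherit the PSU, stratum and weight of the person they belong to.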


On Jun 20, 2013, at 7:18 PM, Angelo Belardi wrote:

Dear All,

I am doing discrete time proportional hazard models using 'cloglog' in
a person-period formatted dataset. The p-p data was created using
'prsnperd', a function from the 'dthaz'-package written by Alexis
Dinno for discrete-time survival and hazard models. This package also
includes the 'dthaz' function to estimate the hazard probabilities. I
would, however, like to use 'cloglog' instead, because I want to work
with 'svy' to account for the complex survey structure of the data.

SVY was set up with the primary sampling unit, weight variable and the
variable identifying the strata. When running the estimations with
this, cloglog runs over all cases in the person-period dataset. I
think that I should at some point tell 'svy' or the 'cloglog' command,
that for each subject in the sample there are several lines and
therefore include the id-variable that identifies which cases belong
to each subject.
How can I include such an identifier-variable in my estimation so that
'svy' works correctly?

Thanks in advance for any input, and I will gladly provide more
information if necessary.
