Statalist



Re: st: Survival analysis: repeated spells


From   "E. Paul Wileyto" <epw@mail.med.upenn.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Survival analysis: repeated spells
Date   Fri, 25 Sep 2009 14:25:42 -0400

You have some choices to make for modeling recurrent events. Stata has many utilities for structuring the risk set for survival modeling, especially for multiple-record data. The best thing is to go to the Stata survival manual and look up the Methods and formulas section for STREG. Here is how you stated the log-likelihood:

ln L(i)=  c log h(t)+log S(t)

The manual writes this in terms of the density f(t) = h(t) S(t), which is algebraically the same thing:

ln L(i)=  c log f(t) + (1-c) log S(t)

The manual adds one more term to make the likelihood conditional on the time of entry into the risk set, t0:

ln L(i)=  c log f(t) + (1-c) log S(t) - log S(t0)
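
Since you have to program the likelihood yourself, here is a minimal sketch in Python of one record's contribution, written in terms of the density f(t) = h(t) S(t) and with the entry-time term on the log scale, log S(t0), because conditioning on entry divides the likelihood by S(t0). The Weibull hazard and the names record_loglik and subject_loglik are assumptions made purely for illustration. Summing the records of one subject assumes the spells are independent given the covariates.

```python
import math

def weibull_log_hazard(t, lam, p):
    # Assumed hazard for illustration: h(t) = p * lam * t**(p - 1)
    return math.log(p) + math.log(lam) + (p - 1) * math.log(t)

def weibull_log_survival(t, lam, p):
    # Matching survivor function: S(t) = exp(-lam * t**p)
    return -lam * t ** p

def record_loglik(t0, t, c, lam, p):
    """Log-likelihood of one record at risk over (t0, t]; c = 1 if it ends in an event."""
    log_S = weibull_log_survival(t, lam, p)
    log_f = weibull_log_hazard(t, lam, p) + log_S   # f(t) = h(t) * S(t)
    log_S0 = weibull_log_survival(t0, lam, p)       # equals 0 when t0 = 0
    return c * log_f + (1 - c) * log_S - log_S0

def subject_loglik(records, lam, p):
    """Sum the contributions of all (t0, t, c) records belonging to one subject."""
    return sum(record_loglik(t0, t, c, lam, p) for t0, t, c in records)
```

With t0 = 0 the last term is zero and the expression reduces to c log h(t) + log S(t).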

You will have to make a choice about how to represent time. For models like Andersen-Gill, the only time = 0 is when you enter the cohort at the beginning: the time of your first event becomes the entry time for the second event, and so on. Gap time models reset the clock back to zero after each event, so t for later events becomes (ti - t0), the entry time is always zero, and the log S(t0) term in the likelihood goes away.
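
The two time scales can be sketched as two ways of laying out the same subject's records. This is only an illustration in Python; the function names and the (t0, t, c) record layout are assumptions, with c = 1 for a record that ends in an event and c = 0 for a censored one.

```python
def counting_process_records(event_times, end_of_followup):
    """Andersen-Gill style layout: the clock runs from cohort entry.

    Each record is (t0, t, c); the previous event time is the next entry time.
    """
    records = []
    start = 0.0
    for t in sorted(event_times):
        records.append((start, t, 1))
        start = t
    if end_of_followup > start:
        # Final stretch with no event is a censored record
        records.append((start, end_of_followup, 0))
    return records

def gap_time_records(event_times, end_of_followup):
    """Gap time layout: subtract t0 so every record starts at zero."""
    return [(0.0, t - t0, c)
            for t0, t, c in counting_process_records(event_times, end_of_followup)]
```

For a subject with events at times 3 and 7 who is followed to time 10, the first layout gives (0, 3], (3, 7], (7, 10] and the second gives spells of length 3, 4, and 3, each starting at zero.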

Beyond that, you will have to adjust at least the standard errors, either by some sort of clustering approach (vce(cluster) or bootstrap) or by using shared frailty. I will say that STSET is unfriendly to certain types of multiple-record data. I have had bad luck trying to STSET data for gap time, and usually have to do something heroic that sidesteps STSET's "idiot proofing." That is, I either have to create my multiple records with an Andersen-Gill type structure and then subtract out the t0 values, or I have to treat it as single-record data. STSET only generates the risk-set representation of the data, so clustering or shared frailty will work without giving the IDs to STSET.
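
To show what the clustering adjustment does, here is a rough sketch in Python for the simplest possible case, a constant hazard with no covariates. It is not Stata's implementation: the function name and the (subject_id, t0, t, c) record layout are assumptions, and the finite-sample correction that software typically applies is left out. The idea is that the score contributions are summed within subject before being squared, so correlated spells from the same subject are not counted as independent information.

```python
import math
from collections import defaultdict

def exponential_fit_clustered(records):
    """records: list of (subject_id, t0, t, c). Returns (rate, naive_se, cluster_se)."""
    events = sum(c for _, _, _, c in records)
    exposure = sum(t - t0 for _, t0, t, _ in records)
    rate = events / exposure                 # MLE of a constant hazard

    # Observed information: minus the second derivative of the log-likelihood
    information = events / rate ** 2

    # Score of one record is c/rate - (t - t0); add them up within each subject
    cluster_scores = defaultdict(float)
    for sid, t0, t, c in records:
        cluster_scores[sid] += c / rate - (t - t0)
    meat = sum(s ** 2 for s in cluster_scores.values())

    naive_se = math.sqrt(1.0 / information)
    # Sandwich A^-1 B A^-1 for one parameter, without a small-sample factor
    cluster_se = math.sqrt(meat) / information
    return rate, naive_se, cluster_se
```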

Paul







V. Martini wrote:
Hello,

this is maybe a trivial question for some of you:

how does Stata account for the presence of repeated spells?

I know that, in order to account for different spells that refer to the same unit, Stata requires you to specify the id( ) option when the data are stset. But my question is about how Stata deals with this problem in the estimation.

For instance, suppose the contribution to the loglikelihood (with right censoring) is given by:

ln L(i)=  c log h(t)+log S(t)

where c is a censoring variable (c=1 if spell is complete), h is the hazard function and S the survival function.

How does the contribution to the likelihood change when unit i has repeated spells? In my case this is relevant because I have to program the likelihood function myself.


Many thanks,

Vinicio
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


--
E. Paul Wileyto, Ph.D.
Assistant Professor of Biostatistics
Tobacco Use Research Center
School of Medicine, U. of Pennsylvania
3535 Market Street, Suite 4100
Philadelphia, PA  19104-3309

215-746-7147
Fax: 215-746-7140
epw@mail.med.upenn.edu