Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | "Carlo Lazzaro" <carlo.lazzaro@tin.it> |
To | <statalist@hsphsun2.harvard.edu> |
Subject | st: R: Stset-ing Multiple Failure/Multiple Spell Data : Moving in and out of risk set |
Date | Tue, 15 Mar 2011 10:16:13 +0100 |
Dear Kathleen,

Laura and Steve gave sound advice, especially regarding the need to revise the way time is reported. As an aside, I suppose that one of the problems you are facing concerns the time at which subjects exit the survival analysis (SA), given the multiple failures they can experience (i.e., switching from self-employed to non-self-employed working status). The following example (run in Stata 9.2/SE) addresses this issue:

--------------- example begins ------------------------------------
set obs 6
g id = 1 in 1/2
replace id=2 in 3/4
replace id=3 in 5/6
g In=0
replace In=6 in 2
replace In=3 in 4
replace In=4 in 6
g Out=1
replace Out=7 in 2
replace Out=8 in 4
replace Out=5 in 6
g No_Self_Employed=1
replace No_Self_Employed=0 in 4
stset Out, id(id) failure(No_Self_Employed==1) time0(In) exit(No_Self_Employed==2) origin(time In)
stdes
--------------- example ends ------------------------------------

In the code above, subjects do not leave the SA at the first failure (i.e., No_Self_Employed==1), since that would conflict with the assumption of multiple failures, but only when the event No_Self_Employed==2 occurs (and this event never occurs in the data).

As I can see from your thread and the previous replies, your subjects do show gaps. You can check whether the gaps are consistent with your methodological expectations using -stdes-.

For more on this topic, I would refer you to: MA Cleves, WW Gould, RG Gutierrez. An introduction to survival analysis using Stata. Revised edition. College Station: Stata Press, 2004: 59-62. The same textbook (147-156) also offers interesting insights on the Cox model with shared frailty, which may fit your data; see also the already referenced http://www.stata.com/support/faqs/stat/stmfail.html.
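A minimal sketch of one possible way to restart the clock at each spell, for data laid out like the listing in the original message quoted below. It is an illustration under stated assumptions, not a tested solution for the real data: the variable names ID, Year0, Year, SelfEmploy and Failure are taken from that listing, the helper variables newspell, spell, spellid and start are hypothetical, and the interval in which the exit is recorded (SelfEmploy==0, Failure==1) is assumed to count as time at risk.

```stata
clear
input ID Year0 Year SelfEmploy Failure
1 1989 1990 0 0
1 1990 1991 1 0
1 1991 1992 1 0
1 1992 1993 1 0
1 1993 1994 1 0
1 1994 1995 0 1
1 1995 1996 0 0
1 1996 1997 1 0
1 1997 1998 1 0
1 1998 1999 1 0
1 1999 2000 0 1
end

* Keep only the intervals at risk, plus the interval in which each exit
* is recorded (assumption: that interval belongs to the spell)
keep if SelfEmploy == 1 | Failure == 1

* Flag the start of a spell: first record of the person, first record
* after a recorded failure, or first record after a gap in calendar time
bysort ID (Year0): gen byte newspell = (_n == 1) | (Failure[_n-1] == 1) | (Year0 != Year[_n-1])
by ID: gen spell = sum(newspell)

* One identifier per person-spell, and the year in which that spell starts
egen spellid = group(ID spell)
bysort spellid (Year0): gen start = Year0[1]

* Each spell is declared as its own "subject", with time measured from
* the start of the spell, so analysis time restarts at 0
stset Year, id(spellid) failure(Failure == 1) time0(Year0) origin(time start)
list ID Year0 Year spell _t0 _t _d, sepby(spellid)
```

Under these assumptions, the listing should show _t0 starting at 0 in both spells (1990 and 1996), with the 1995-1996 interval excluded altogether. Because each spell is declared as a separate subject, the dependence between spells of the same person is not handled by -stset- itself; it would have to be addressed at the modelling stage, for instance by clustering on ID or by the shared-frailty approach mentioned above.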
HTH and Kind Regards,
Carlo

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Kathleen Bui
Sent: Sunday, 13 March 2011 16:31
To: statalist@hsphsun2.harvard.edu
Subject: st: Stset-ing Multiple Failure/Multiple Spell Data : Moving in and out of risk set

My question is how to stset a multiple-failure data set when an individual can move in and out of the risk set. I have read Cleves's An Introduction to Survival Analysis Using Stata, Cleves's STB-49, and all previous posts concerning stset-ting multiple failures. Others have asked questions similar to mine, but I have yet to find a solution that works.

I am analyzing the duration of an individual's stay in self-employment. Failure is exit from self-employment. My question is how I can stset the data so that Stata recognizes that an individual can move into and out of the risk set (that is, being self-employed).

To be more explicit, for each individual in my data set, I have information as to whether or not they are self-employed. The issue arises when an individual has a self-employment history as follows: The individual is self-employed and therefore at risk of failure. Then they fail (leave self-employment) and enter waged employment. By entering waged employment, they are no longer at risk of failing, since they are no longer self-employed. However, after a period of time, they once again become self-employed (thus re-entering the risk set) and fail once again (their second failure).

As a result, multiple failures are possible as individuals move in and out of different employment states. However, although I understand that Stata can recognize multiple failures, I am unsure how stset can be used to recognize the multiple spells of self-employment, particularly the period of time between spells when the individual is no longer at risk.
Specifically, I am unable to set the analysis time back to 0 when the individual begins a second period at risk after being not at risk. For example, one individual in my data set of multiple individuals can look like:

    +------------------------------------------+
    | ID   Year0   Year   SelfEmploy   Failure |
    |------------------------------------------|
 1. |  1    1989   1990            0         0 |
 2. |  1    1990   1991            1         0 |
 3. |  1    1991   1992            1         0 |
 4. |  1    1992   1993            1         0 |
 5. |  1    1993   1994            1         0 |
 6. |  1    1994   1995            0         1 |
 7. |  1    1995   1996            0         0 |
 8. |  1    1996   1997            1         0 |
 9. |  1    1997   1998            1         0 |
10. |  1    1998   1999            1         0 |
11. |  1    1999   2000            0         1 |
    +------------------------------------------+

where "SelfEmploy" is the indicator variable denoting whether or not the individual is self-employed, "Failed" is an indicator variable denoting whether the individual has left self-employment, and Year0 and Year are the beginning and end of the corresponding time period.

So between 1990 and 1994 the individual is at risk of failing, and fails between 1994 and 1995. Between 1995 and 1996, they are no longer at risk of failing (say they are employed in the waged sector). They then enter self-employment again in 1996 and experience another failure in 1999-2000.

Is there a command in stset that allows Stata to "ignore" the periods when they are no longer at risk? For example, when I stset my data as follows:

stset year, origin(SelfEmploy==1) failure(Failed) time0(Year0) id(PersonID) exit(time .)

the period when they are no longer at risk of failing is treated as if they were in self-employment, as the output I receive is:

    +---------------------------------------------------------------+
    | ID   Year0   Year   SelfEmploy   Failure   _s   _d   _t0   _t |
    |---------------------------------------------------------------|
 1. |  1    1989   1990            0         0    0    0     .    . |
 2. |  1    1990   1991            1         0    0    0     .    . |
 3. |  1    1991   1992            1         0    1    0     0    1 |
 4. |  1    1992   1993            1         0    1    0     1    2 |
 5. |  1    1993   1994            1         0    1    0     2    3 |
 6. |  1    1994   1995            0         1    1    1     3    4 |
 7. |  1    1995   1996            0         0    1    0     4    5 |
 8. |  1    1996   1997            1         0    1    0     5    6 |
 9. |  1    1997   1998            1         0    1    0     6    7 |
10. |  1    1998   1999            1         0    1    0     7    8 |
11. |  1    1999   2000            0         1    1    1     8    9 |
    +---------------------------------------------------------------+

Stata seems to count the period from 1995 to 1996 as a time when the individual is at risk of failing, when he is not. Therefore, I am unsure how to stset the data so that Stata recognizes that the individual is no longer at risk of failing from 1995 to 1996, and so that my analysis time can be "reset" to 0 when the individual begins a second period at risk after being not at risk.

Any suggestions? Any help would be appreciated! Thanks!!

Kathleen Bui

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/