
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: R: Stset-ing Multiple Failure/Multiple Spell Data : Moving in and out of risk set


From   Steven Samuels <[email protected]>
To   [email protected]
Subject   Re: st: R: Stset-ing Multiple Failure/Multiple Spell Data : Moving in and out of risk set
Date   Sat, 19 Mar 2011 17:02:47 -0400

Kathleen--

With your data, you are obligated to report that measurement error of *at least* ±1 year is possible in the recorded "times" of employment, because the dates within a year on which self-employment started or stopped are unknown. Also report that estimates of the probability that a person stayed self-employed for at least k years are biased upward. The bias arises because the data don't record instances where people left and returned to self-employment between interviews. So, for example, four consecutive "years" (i.e. interviews) of reported self-employment could be made up of a number of shorter spells.

Status at interview apparently was the only observation actually made, so I suggest that you model that status directly instead of a questionable time variable. Such an analysis would be based on the same data as you'd feed into -stset-. Model the probability that a person who was self-employed at the year K interview was also self-employed at the year K+1 interview. In this analysis, time zero is the first interview in a spell of self-employment, and you index all the subsequent interviews as Nick suggested.

If your data are based on a complex survey sample, -svyset- your data and use -svy: logistic-. Failure to do so would invalidate your standard errors and hypothesis tests.
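
A minimal sketch of the data step for such a model, using the two people from Kathleen's listing further down the thread. NextSE and Time are names made up for this illustration; the cross-tabulation is only a raw look at the interview-to-interview transitions that a model would describe, not a fitted model.

```stata
clear
input PersonID Year SelfEmploy _spell
1 1990 0 0
1 1991 1 1
1 1992 1 1
1 1993 1 1
1 1994 1 1
1 1995 0 0
1 1996 0 0
1 1997 1 2
1 1998 1 2
1 1999 1 2
1 2000 0 0
2 1993 0 0
2 1994 1 1
2 1995 1 1
2 1996 1 1
2 1997 1 1
2 1998 0 0
end

* Status at the next interview for the same person; left missing when
* there is no interview in the following year (NextSE is a made-up name)
bysort PersonID (Year): gen byte NextSE = SelfEmploy[_n+1] if Year[_n+1] == Year + 1

* Interview index within each spell, starting at 0, as Nick suggested
bysort PersonID _spell (Year): gen Time = Year - Year[1] if _spell

* Raw transitions among those self-employed at the current interview
tabulate Time NextSE if SelfEmploy == 1, row
```

With the full data, NextSE could serve as the outcome in something like -logit NextSE i.Time if SelfEmploy == 1, vce(cluster PersonID)-, or in the -svy: logistic- counterpart after -svyset-. Treat that specification as one possibility, not a recommendation.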

Steve

Steven J. Samuels
Consulting Statistician
18 Cantine's Island
Saugerties, NY 12477 USA
Voice: 845-246-0774
Fax:   206-202-4783 
[email protected]


 

On Mar 19, 2011, at 5:46 AM, Nick Cox wrote:

I don't understand what you are trying to do, but given a
classification of spells by a variable -_spell- then time in each
spell has a minimum

egen Start = min(Year) if _spell, by(PersonID _spell)

so that you just need to subtract that from Year to get a time
variable that starts at 0 in each spell.

Another way to do it is

bysort PersonID _spell (Year) : gen Time = Year - Year[1] if _spell
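
As a quick check of the two approaches, here is a minimal sketch on made-up data for one person. The values are illustrative and not taken from Kathleen's file; the variables are named as in her listing, and Time1 and Time2 are names chosen only for this comparison. Both approaches should give a clock that restarts at 0 in each spell and is missing wherever _spell is 0.

```stata
clear
input PersonID Year _spell
1 1990 0
1 1991 1
1 1992 1
1 1993 1
1 1994 0
1 1995 2
1 1996 2
end

* Approach 1: earliest Year within each spell, then subtract it
egen Start = min(Year) if _spell, by(PersonID _spell)
gen Time1 = Year - Start

* Approach 2: first Year in the spell after sorting within spell
bysort PersonID _spell (Year): gen Time2 = Year - Year[1] if _spell

* The two clocks should agree (both are missing outside spells)
assert Time1 == Time2

sort PersonID Year
list PersonID Year _spell Time1 Time2, sepby(_spell)
```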

Nick

On Sat, Mar 19, 2011 at 12:13 AM, Kathleen Bui <[email protected]> wrote:

> Thanks for all the help!
> 
> I do understand that smaller time intervals would be much better, but I
> don't have access to any time frame smaller than a year.
> 
> On another note, I was wondering how I go about "resetting" the time to zero
> for each spell of self-employment, since I have multiple observations for each
> spell of self-employment? (If I wanted to employ the PWP gap time model approach)
> 
> 
> 
> For example, following my example before, if I had something that looked like:
> 
> (where _spell just indicates which spell of self-employment it is: first,
> second, etc.),
> 
> 
> How can I stset the data so the time is "reset" to zero for each new spell?
> 
> 
>      +--------------------------------------------------------+
>      | PersonID   Year0   Year   Failed   SelfEmploy   _spell |
>      |--------------------------------------------------------|
>   1. |        1       .   1990        0            0        0 |
>   2. |        1    1990   1991        0            1        1 |
>   3. |        1    1991   1992        0            1        1 |
>   4. |        1    1992   1993        0            1        1 |
>   5. |        1    1993   1994        1            1        1 |
>   6. |        1    1994   1995        0            0        0 |
>   7. |        1    1995   1996        0            0        0 |
>   8. |        1    1996   1997        0            1        2 |
>   9. |        1    1997   1998        0            1        2 |
>  10. |        1    1998   1999        1            1        2 |
>  11. |        1    1999   2000        0            0        0 |
>      |--------------------------------------------------------|
>  12. |        2       .   1993        0            0        0 |
>  13. |        2    1993   1994        0            1        1 |
>  14. |        2    1994   1995        0            1        1 |
>  15. |        2    1995   1996        0            1        1 |
>  16. |        2    1996   1997        1            1        1 |
>  17. |        2    1997   1998        0            0        0 |
>      +--------------------------------------------------------+
> 
> If I do:
> 
> stset Year, origin(SelfEmploy==1) failure(Failed) time0(Year0) id(PersonID)
> exit(time .) if(_spell!=0)
> 
> this doesn't reset the time for the beginning of each spell, rather it continues
> (with time gaps) from the time of the first spell.
> 
> Thanks again! Appreciate the help!
> -Kathleen
> 
> 
> The following example (performed in Stata 9.2/SE) considers this issue:
> --------------- example begins ------------------------------------
> set obs 6
> g id = 1 in 1/2
> replace id=2 in 3/4
> replace id=3 in 5/6
> g In=0
> replace In=6 in 2
> replace In=3 in 4
> replace In=4 in 6
> g Out=1
> replace Out=7 in 2
> replace Out=8 in 4
> replace Out=5 in 6
> g No_Self_Employed=1
> replace No_Self_Employed=0 in 4
> stset Out, id(id) failure(No_Self_Employed==1) time0(In) exit(No_Self_Employed==2) origin(time In)
> stdes
> --------------- example ends ------------------------------------
> 
> In the code above, subjects do not leave the survival analysis (SA) at the
> first failure (i.e. No_Self_Employed==1), since that would conflict with the
> assumption of multiple failures. They leave only when the event
> No_Self_Employed==2 occurs (and this event will never occur).
> 
> As I can see from your thread and previous replies, your subjects do show
> gaps. You can check whether gaps are consistent with your methodological
> expectations using - stdes -.
> 
> For more on this topic, I would refer you to:
> MA Cleves, WW Gould, RG Gutierrez. An introduction to survival analysis using
> Stata. Revised edition. College Station: Stata Press, 2004: 59-62. The same
> textbook (147-156) also offers interesting insights on the Cox model with
> shared frailty, which may fit your data;
> and to the already referenced http://www.stata.com/support/faqs/stat/stmfail.html.
> 
> HTH and Kind Regards,
> Carlo
> -----Original message-----
> From: [email protected]
> [mailto:[email protected]] On behalf of Kathleen Bui
> Sent: Sunday, 13 March 2011 16:31
> To: [email protected]
> Subject: st: Stset-ing Multiple Failure/Multiple Spell Data : Moving in and
> out of risk set
> 
> My question is how to stset a multiple failure data set when an individual can
> move in and out of the risk set.
> 
> I have read Cleves’s An Introduction to Survival Analysis Using Stata, Cleves’s
> STB-49, and all previous posts concerning st-setting multiple failures. Others
> have asked questions similar to mine, but I have yet to find a solution that
> works.
> 
> I am analyzing the duration of an individual’s stay in Self-Employment. Failure
> will be exit from self-employment. My question is how I can stset the data so
> that Stata recognizes that an individual can move into and out of the risk set
> (which is being Self-Employed).
> 
> To be more explicit, for each individual in my data set, I have information as
> to whether or not they are Self-Employed. The issue arises when an individual
> has a self-employment history as follows:
> 
> The individual is self-employed and therefore at risk of failure. Then they
> fail (leave self-employment) and enter waged employment. By entering waged
> employment, they are no longer at risk of failing, since they are no longer
> Self-Employed. However, after a period of time, they once again become
> Self-Employed (thus re-enter the risk set) and fail once again (their second
> failure).
> 
> As a result, multiple failures are possible as individuals are moving in and
> out of different employment states. However, although I understand that Stata
> can recognize multiple failures, I am unsure of how stset can be used to
> recognize the multiple spells of Self-Employment, particularly the period of
> time between spells when the individual is no longer at risk.
> 
> Specifically, I am unable to set the analysis time back to 0 for when the
> individual begins a second period at risk after being not at risk.
> 
> For example, one individual in my data set of multiple individuals can look
> like:
> 
>      +------------------------------------------+
>      | ID   Year0   Year   SelfEmploy   Failure |
>      |------------------------------------------|
>   1. |  1    1989   1990            0         0 |
>   2. |  1    1990   1991            1         0 |
>   3. |  1    1991   1992            1         0 |
>   4. |  1    1992   1993            1         0 |
>   5. |  1    1993   1994            1         0 |
>   6. |  1    1994   1995            0         1 |
>   7. |  1    1995   1996            0         0 |
>   8. |  1    1996   1997            1         0 |
>   9. |  1    1997   1998            1         0 |
>  10. |  1    1998   1999            1         0 |
>  11. |  1    1999   2000            0         1 |
>      +------------------------------------------+
> 
> where “SelfEmploy” is the indicator variable denoting whether or not the
> individual is self-employed, “Failed” is an indicator variable denoting if the
> individual has left self-employment, and Year0 and Year are the corresponding
> beginning and end of the time period.
> 
> So between 1990 and 1994, the individual is at risk of failing, and fails
> between 1994 and 1995. But between 1995 and 1996, they are no longer at risk
> of failing (say they are employed in the waged sector). But then they enter
> self-employment in 1996 and thus experience another failure between 1999 and
> 2000.
> 
> Is there a command in stset that allows Stata to “ignore” the periods when
> they are no longer at risk?
> 
> For example, when I stset my data as follows: stset year,
> origin(SelfEmploy==1) failure(Failed) time0(Year0) id(PersonID) exit(time .),
> the period when they are no longer at risk of failing is treated as if they
> are in self-employment, as the output I receive is:
> 
> 
>      +---------------------------------------------------------------+
>      | ID   Year0   Year   SelfEmploy   Failure   _s   _d   _t0   _t |
>      |---------------------------------------------------------------|
>   1. |  1    1989   1990            0         0    0    0     .    . |
>   2. |  1    1990   1991            1         0    0    0     .    . |
>   3. |  1    1991   1992            1         0    1    0     0    1 |
>   4. |  1    1992   1993            1         0    1    0     1    2 |
>   5. |  1    1993   1994            1         0    1    0     2    3 |
>   6. |  1    1994   1995            0         1    1    1     3    4 |
>   7. |  1    1995   1996            0         0    1    0     4    5 |
>   8. |  1    1996   1997            1         0    1    0     5    6 |
>   9. |  1    1997   1998            1         0    1    0     6    7 |
>  10. |  1    1998   1999            1         0    1    0     7    8 |
>  11. |  1    1999   2000            0         1    1    1     8    9 |
>      +---------------------------------------------------------------+
> 
> 
> Stata seems to count the period from 1995-1996 as a time where the individual
> is at risk of failing, when he is not.
> 
> Therefore, I am unsure as to how to st-set the data so that from 1995-1996,
> Stata recognizes that the individual is no longer at risk of failing and that
> my analysis time can be “Reset” to 0 for when the individual begins a second
> period at risk after being not at risk.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



