Statalist The Stata Listserver



RE: st: Generating an indicator of occurance of an event during the interval


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: Generating an indicator of occurance of an event during the interval
Date   Fri, 15 Dec 2006 16:36:19 -0000

Thanks. 

Actually, there's at least one bug. 

Consider 

by id : gen indicator = four[_n - 1] > 0 if outcome == 2

and what happens at the first member of each panel, for
which _n == 1. Thus _n - 1 == 0. -four[0]- will always be
treated as missing by Stata, and because Stata regards missing
as larger than any number, it counts as > 0. That will be
relevant whenever the first observation in a panel has
-outcome == 2-.
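
A quick way to check that behaviour, as a minimal sketch (these two
-display- lines are only illustrative and are not part of the solution):

```stata
* numeric missing (.) is treated as larger than any number,
* so this comparison is true and displays 1
display (. > 0)

* guarding against missing explicitly makes the result false, displaying 0
display (. > 0) & !missing(.)
```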

Well, if it's the first member of the panel, evidently 
we know nothing about anything previous. So the code should 
be 

by id : gen indicator = (four[_n - 1] > 0) if outcome == 2 & _n > 1 

to ensure that the indicator is always missing for the first 
member. 

I parenthesised 

(four[_n-1] > 0) 

to underline that it's the expression that counts here; that is, 
Stata will evaluate this as 1 or 0 to get the indicator desired, 
although only if the -if- condition is satisfied. 
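
Putting the pieces together on the example data from the original
question gives something like the sketch below. Two things in it are
assumptions rather than part of the code in this thread: the -input-
block simply retypes the posted example, and the -if- condition uses
-spell > 1- in place of -_n > 1-, so that the first success in each
panel stays missing even when it is not the first observation, which
is what the requested output shows.

```stata
clear
input id eventid outcome
1 1 1
1 2 2
1 3 4
1 4 2
2 1 2
2 2 4
2 3 4
2 4 3
2 5 2
3 1 2
3 2 2
3 3 1
3 4 4
3 5 2
end

* each success (outcome == 2) starts a new spell within id
bysort id (eventid) : gen spell = sum(outcome == 2)

* count outcomes of 4 within each spell and spread the total over the spell
bysort id spell (eventid) : gen four = sum(outcome == 4)
by id spell : replace four = four[_N]

* flag successes after the first: was there a 4 in the preceding spell?
* (spell > 1 is this sketch's variation; it also implies _n > 1)
by id : gen indicator = (four[_n - 1] > 0) if outcome == 2 & spell > 1

list id eventid outcome spell four indicator, sepby(id)
```

With -_n > 1- as the guard instead, id 1 at eventid 2 would get 0
rather than missing, because it is a first success but not the first
observation in its panel.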

Nick 
[email protected] 

Sergio Correia
 
> Interesting answer. Much more bug-free.
> 
> By the way, adding -bys id (eventid) : - to each line makes the code
> panel-ready so it's not that much of a problem.

> On 12/15/06, Nick Cox <[email protected]> wrote:

> > I haven't tried understanding Sergio's code, as it
> > appears to take no account of the fact that this is panel data
> > and so calculations must be done separately for each
> > identifier.
> >
> > Consider starting with -id-, -eventid- and -outcome-. Then
> >
> > bysort id (eventid) : gen order = sum(outcome == 2) * (outcome == 2)
> >
> > or
> >
> > bysort id (eventid) : gen order = cond(outcome == 2, sum(outcome == 2), 0)
> >
> > -outcome == 2- evaluates as 1 or 0 depending on whether it is true
> > or false, and the -sum()- gives you the cumulative sum.
> >
> > Furthermore each occurrence of -outcome == 2- defines a new
> > "spell":
> >
> > by id : gen spell = sum(outcome == 2)
> >
> > bysort id spell (eventid) : gen four = sum(outcome == 4)
> > by id spell : replace four = four[_N]
> >
> > by id : gen indicator = four[_n - 1] > 0 if outcome == 2
> >
> > The "spell" point of view is simple but gives you a handle
> > on many problems in this territory. It is really piggy-backing
> > on -by:-. For spell incantations, see -tsspell- on SSC and
> > its quite detailed help file. For -by:- explanations, -search
> > by- and follow up manual and Stata Journal references.
> >
> > As Sergio says, -egen- is another route here but a path from
> > first principles is always instructive.

Sergio Correia

> > > This works but I'm pretty sure it's not the best way:
> > >
> > > * CODE:
> > > gen xyz = outcome==4
> > > replace xyz = (xyz[_n-1]==1 | xyz ==1) & (outcome!=2)
> > > gen indicator = xyz[_n-1]==1 if order>1
> > >
> > >
> > > Line 1 is straightforward.
> > > Line 2 is 1 if there has been an outcome of 4 since the last success
> > > (and we are not on a successful outcome)
> > > Line 3 is also simple
> > >
> > > The last two lines can be merged but that would make them harder to
> > > understand. Again, I'm sure there are better answers (maybe egen, sum
> > > or more complex gens).
> >
> > Le Wang
> >
> > > > I have a data set containing four variables
> > > >
> > > > (1) household id (2) event id (3) event outcome (4) order of success
> > > >
> > > > event outcomes can take on values of 1,2,3,4; if the event outcome is
> > > > 2, it is successful and ordered according to the timing of occurrence
> > > > of the success (recorded in the fourth variable "order of success").
> > > > The data looks like what follows,
> > > >
> > > >
> > > > ------------------------------------------------------------
> > > > id      eventid outcome order of success
> > > > 1       1       1       0
> > > > 1       2       2       1
> > > > 1       3       4       0
> > > > 1       4       2       2
> > > > 2       1       2       1
> > > > 2       2       4       0
> > > > 2       3       4       0
> > > > 2       4       3       0
> > > > 2       5       2       2
> > > > 3       1       2       1
> > > > 3       2       2       2
> > > > 3       3       1       0
> > > > 3       4       4       0
> > > > 3       5       2       3
> > > > .
> > > > .
> > > > .
> > > > .
> > > >
> > > > ------------------------------------------------------------
> > > >
> > > > What I wanna do is to create a variable for obs with the order of
> > > > success greater than 1; this variable indicates whether or not there
> > > > exists an event outcome equal to 4 during the interval between this
> > > > success and the previous success. The final data for the example
> > > > should look like the following
> > > >
> > > >
> > > > ------------------------------------------------------------
> > > >
> > > > id      eventid outcome order of success        indicator
> > > > 1       1       1       0       .
> > > > 1       2       2       1       .
> > > > 1       3       4       0       .
> > > > 1       4       2       2       1
> > > > 2       1       2       1       .
> > > > 2       2       4       0       .
> > > > 2       3       4       0       .
> > > > 2       4       3       0       .
> > > > 2       5       2       2       1
> > > > 3       1       2       1       .
> > > > 3       2       2       2       0
> > > > 3       3       1       0       .
> > > > 3       4       4       0       .
> > > > 3       5       2       3       1
> > > > .
> > > > .
> > > > .
> > > > .
