
Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org, is already up and running.



RE: st: Identifying and recording the first occurrence of an event by actor given category


From   Erik Aadland <erikaadland@hotmail.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Identifying and recording the first occurrence of an event by actor given category
Date   Fri, 7 Sep 2012 11:36:10 +0000

Thank you, Nick, for your solution and for the interesting reference!

To make the first occurrence conditional on specific category values (e.g. 2), I modified the code as follows:

egen first_1 = min(year / (event == 1 & category_id == 2)), by(actor_id)

This modification appears to work well, too.
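
For anyone who wants to check this on the posted data, here is a minimal sketch. It assumes the six example observations from the original question; the variable name first_alt is introduced only for this illustration and is not part of the thread.

```stata
* Sketch: rebuild the six example observations from the original question
clear
input actor_id year category_id event
1 2000 1 .
1 2000 2 1
1 2001 2 1
2 2003 3 .
2 2003 2 1
2 2004 2 .
end

* Where the condition is false (0), year / 0 is missing, and min() ignores missings
egen first_occurrence = min(year / (event == 1 & category_id == 2)), by(actor_id)

* The same condition written with cond(); first_alt is a name chosen for this sketch
egen first_alt = min(cond(event == 1 & category_id == 2, year, .)), by(actor_id)

* Show the result panel by panel
list, sepby(actor_id) noobs
```

Both new variables should match the first_occurrence column in the desired output quoted below.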

Kind regards,
Erik.


> Date: Fri, 7 Sep 2012 10:47:31 +0100
> Subject: Re: st: Identifying and recording the first occurrence of an event by actor given category
> From: njcoxstata@gmail.com
> To: statalist@hsphsun2.harvard.edu
> 
> Stata is great at this kind of problem. The essence of Erik's
> difficulty is the need to look in other observations for the same
> panel to produce the new variable.
> 
> First off, the first year anything occurred is just the minimum year
> anything occurred, so we can get at that minimum in several ways:
> sorting, using -summarize-, -egen- etc.
> 
> Given the panel structure, -egen- is a good tool, because functions
> that support a -by()- option or a -by:- prefix will handle panels
> separately.
> 
> Here is one solution:
> 
> egen first_1 = min(year / (event == 1)), by(actor_id)
> 
> Here is another:
> 
> egen first_1 = min(cond(event == 1, year, .)), by(actor_id)
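
The sorting route mentioned above can be sketched as follows. This is one possible way to do it, not code from the thread: the helper variable nothit, the result variable first_sorted, and the extra actor 3 (who never experiences the event) are hypothetical additions for illustration.

```stata
* Sketch: example data from the question plus one hypothetical actor (id 3)
* who never experiences the event
clear
input actor_id year category_id event
1 2000 1 .
1 2000 2 1
1 2001 2 1
2 2003 3 .
2 2003 2 1
2 2004 2 .
3 2005 2 .
end

* nothit is 0 for observations with event == 1, so they sort first within actor
gen byte nothit = !(event == 1)

* After sorting, the first observation in each panel is the earliest event year,
* provided the panel has any event at all; otherwise the result is missing
bysort actor_id (nothit year): gen first_sorted = cond(nothit[1] == 0, year[1], .)

drop nothit
list, sepby(actor_id) noobs
```

Unlike the -egen- versions, this changes the sort order of the dataset, which may or may not matter for later steps.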
> 
> This approach is discussed in detail within
> 
> Cox, N.J. 2011. Speaking Stata: Compared with ... Stata Journal 11(2): 305-314
> 
> Abstract. Many problems in data management center on relating values
> to values in other observations, either within a dataset as a whole or
> within groups such as panels. This column reviews some basic Stata
> techniques helpful for such tasks, including the use of subscripts,
> summarize, by:, sum(), cond(), and egen. Several techniques exploit
> the fact that logical expressions yield 1 when true and 0 when false.
> Dividing by zero to yield missings is revealed as a surprisingly
> valuable device.
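
The two facts the abstract relies on can be checked directly with -display-. This is a small sketch using arbitrary numbers chosen for illustration, not values from the thread.

```stata
* Logical expressions evaluate to 1 when true and 0 when false
display (3 > 2)
display (3 < 2)

* Dividing by a true condition leaves the value unchanged;
* dividing by a false condition (zero) yields missing
display 2000 / (3 > 2)
display 2000 / (3 < 2)

* min() ignores missing arguments unless all of them are missing
display min(2001, 2000 / (3 < 2))
```

One caution when adapting the condition: a test such as event == 1 is false when event is missing, but event > 0 would be true, because Stata treats numeric missing as larger than any number.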
> 
> Erik's question appears a bit more complicated than I have answered
> here; if there is some twist I have missed no doubt he will make that
> clear.
> 
> Nick
> 
> On Fri, Sep 7, 2012 at 10:07 AM, Erik Aadland <erikaadland@hotmail.com> wrote:
> 
> > I have an unbalanced panel dataset.
> > This is the structure:
> > actor_id  year  category_id  event
> >        1  2000            1      .
> >        1  2000            2      1
> >        1  2001            2      1
> >        2  2003            3      .
> >        2  2003            2      1
> >        2  2004            2      .
> >
> > I want to generate a variable -first_occurrence- that records, for each actor_id, the first time the actor experienced event = 1 within a given category (e.g. category_id = 2). The variable should hold the value of -year- at the time of that first occurrence. Some actors never experience event = 1.
> > For instance, if I track first occurrence for category_id = 2, this is what I am looking for:
> > actor_id  year  category_id  event  first_occurrence
> >        1  2000            1      .              2000
> >        1  2000            2      1              2000
> >        1  2001            2      1              2000
> >        2  2003            3      .              2003
> >        2  2003            2      1              2003
> >        2  2004            2      .              2003
> >
> > Any input or suggestions on this problem would be greatly appreciated.
> 
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP