

From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Identifying and recording the first occurrence of an event by actor given category
Date: Fri, 7 Sep 2012 10:47:31 +0100

Stata is great at this kind of problem. The essence of Erik's difficulty is the need to look in other observations for the same panel to produce the new variable.

First off, the first year anything occurred is just the minimum year anything occurred, so we can get at that minimum in several ways: sorting, using -summarize-, -egen- etc. Given the panel structure, -egen- is a good tool, because functions that support a -by()- option or a -by:- prefix will handle panels separately.

Here is one solution:

    egen first_1 = min(year / (event == 1)), by(actor_id)

Here is another:

    egen first_1 = min(cond(event == 1, year, .)), by(actor_id)

This approach is discussed in detail within

Cox, N.J. 2011. Speaking Stata: Compared with ... Stata Journal 11(2): 305-314

Abstract. Many problems in data management center on relating values to values in other observations, either within a dataset as a whole or within groups such as panels. This column reviews some basic Stata techniques helpful for such tasks, including the use of subscripts, summarize, by:, sum(), cond(), and egen. Several techniques exploit the fact that logical expressions yield 1 when true and 0 when false. Dividing by zero to yield missings is revealed as a surprisingly valuable device.

Erik's question appears a bit more complicated than I have answered here; if there is some twist I have missed no doubt he will make that clear.

Nick

On Fri, Sep 7, 2012 at 10:07 AM, Erik Aadland <erikaadland@hotmail.com> wrote:

> I have an unbalanced panel dataset.
> This is the structure:
>
> actor_id  year  category_id  event
>        1  2000            1      .
>        1  2000            2      1
>        1  2001            2      1
>        2  2003            3      .
>        2  2003            2      1
>        2  2004            2      .
>
> I want to generate a variable -first_occurrence- that identifies and records for each actor_id the first time the actor experienced event = 1 if the category = e.g. 2. I would like this -first occurrence- variable to capture the value of -year- at the time of first event occurrence. Some actors never experience event = 1.
>
> For instance, if I track first occurrence by category_id = 2, this is what I look for:
>
> actor_id  year  category_id  event  first_occurrence
>        1  2000            1      .              2000
>        1  2000            2      1              2000
>        1  2001            2      1              2000
>        2  2003            3      .              2003
>        2  2003            2      1              2003
>        2  2004            2      .              2003
>
> Any input or suggestions on this problem would be greatly appreciated.
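
The two -egen- solutions in the reply can be tried directly on the six observations quoted above. The following is a minimal sketch, not part of the original message: the variable names first_div, first_cond and first_occurrence are made up for illustration, and the last command assumes that the "twist" in the question is simply the extra condition category_id == 2.

```stata
* Recreate the example data from the quoted question
clear
input actor_id year category_id event
1 2000 1 .
1 2000 2 1
1 2001 2 1
2 2003 3 .
2 2003 2 1
2 2004 2 .
end

* Division approach: (event == 1) is 1 when true and 0 when false,
* so year / 0 yields missing, and min() ignores missing values
egen first_div = min(year / (event == 1)), by(actor_id)

* cond() approach: keep year where event == 1, missing otherwise
egen first_cond = min(cond(event == 1, year, .)), by(actor_id)

* Assumed extension: also require category_id == 2
egen first_occurrence = min(cond(event == 1 & category_id == 2, year, .)), by(actor_id)

list, sepby(actor_id)
```

On these data all three new variables hold 2000 for actor 1 and 2003 for actor 2, which matches the first_occurrence column requested in the question. An actor with no qualifying observation would get missing, since -egen-'s min() returns missing when every value in the group is missing.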

