Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Capturing the date and which something first occurs
Date: Sun, 8 Apr 2012 15:28:03 +0100

The question raised by Richard concerns an indicator variable for the first occurrence of an event. At its simplest there is a -state- indicator and we are looking for the first time that -state == 1-.

The approach in the FAQ I wrote on this question

http://www.stata.com/support/faqs/data/firstoccur.html

is, I now think, too indirect. I would now urge a focus on finding the date of the first occurrence; an indicator is then just given by when the date variable is equal to that first date.

The first date is just the minimum, and we can get that easily, even with panel data, using -egen-:

    egen first_date = min(date / (state == 1)), by(id)

or

    egen first_date = min(cond(state == 1, date, .)), by(id)

The expressions fed to the -min()- function of -egen- are

    date / (state == 1)

    cond(state == 1, date, .)

They are equivalent, and both are focused on getting -egen- to ignore everything except the times when -state == 1-. In the first, dividing by a false condition (0) yields missing; in the second, -cond()- returns missing explicitly. If there are no such times for an -id-, then the expressions are missing throughout, so the result is missing, which is the right answer.

Although for simplicity we are in this example looking for -state == 1-, i.e. values of 1 for an indicator for what interests us, that is just detail. The main idea generalises easily to any true-or-false condition:

    egen first_date = min(date / (foo == 42)), by(id)

Another nice feature of this approach is that it extends easily to giving the _last_ date, using -max()- as the -egen- function.

Nick

On Sat, Apr 7, 2012 at 7:21 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> Also consider
>
> gsort id -state time
> by id: gen date_first = time[1] if state[1] == 1
> gen is_first = time == date_first
>
> On Fri, Apr 6, 2012 at 6:41 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>> When the first occurrence occurred is discussed in the same FAQ
>>
>> http://www.stata.com/support/faqs/data/firstoccur.html
>>
>> Here is another way to do it.
>>
>> egen date_first = min(time / state), by(id)
>>
>> Explanation: This is the trick that dividing by zero can be useful.
>> time / 0 is returned as missing and thus ignored in the calculation of
>> the minimum, as long as the -state- did occur.
>>
>> On Fri, Apr 6, 2012 at 6:28 PM, Richard T. Campbell <dcamp@uic.edu> wrote:
>>> Suppose I have a data set like that used by Nick Cox in an FAQ which shows
>>> how to capture a record at which something first occurs. Here is his example.
>>>
>>>      +---------------------------+
>>>      | id   time   state   first |
>>>      |---------------------------|
>>>   1. |  1      1       0       0 |
>>>   2. |  1      2       0       0 |
>>>   3. |  1      3       0       0 |
>>>   4. |  1      4       1       1 |
>>>   5. |  1      5       1       0 |
>>>   6. |  1      6       1       0 |
>>>   7. |  1      7       1       0 |
>>>   8. |  1      8       1       0 |
>>>   9. |  1      9       1       0 |
>>>  10. |  1     10       1       0 |
>>>      |---------------------------|
>>>  11. |  2      2       0       0 |
>>>  12. |  2      2       0       0 |
>>>  13. |  2      3       1       1 |
>>>  14. |  2      4       1       0 |
>>>  15. |  2      5       1       0 |
>>>  16. |  2      6       1       0 |
>>>  17. |  2      7       1       0 |
>>>  18. |  2      8       1       0 |
>>>  19. |  2      9       0       0 |
>>>  20. |  2     10       0       0 |
>>>      |---------------------------|
>>>  21. |  3      1       0       0 |
>>>  22. |  3      2       1       1 |
>>>  23. |  3      3       0       0 |
>>>      +---------------------------+
>>>
>>> So, for ID 1, the first time at which state = 1 occurs is the fourth record,
>>> for ID 2 it is the third record etc. I want to assign a value within an id
>>> equal to that index. For example, for ID 1 I want a variable that equals 4
>>> for all ten cases, for ID 2 a variable equal to 3 for all cases etc. Put
>>> differently, I want to assign to all cases within an id, the value of _n
>>> when first = 1. I can't seem to get my head around how to do this.
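
As a minimal sketch of the approach described in the reply above, the following do-file fragment builds a small made-up panel and applies both -egen- expressions, then derives the indicator and the last date. The data values and the variable names first_time, first_time2, is_first and last_time are illustrative assumptions, not taken from the thread.

```stata
* Toy panel: values are invented for illustration only
clear
input id time state
1 1 0
1 2 0
1 3 1
1 4 1
2 1 0
2 2 0
3 1 1
3 2 0
3 3 1
end

* First time at which state == 1 within each id (missing if it never occurs)
egen first_time = min(cond(state == 1, time, .)), by(id)

* Equivalent form: dividing by a false (0) condition yields missing
egen first_time2 = min(time / (state == 1)), by(id)
assert first_time == first_time2

* Indicator for the observation at which the state first occurs
gen byte is_first = time == first_time

* Last time at which state == 1: same idea with max() instead of min()
egen last_time = max(cond(state == 1, time, .)), by(id)

list, sepby(id)
```

With these toy data, id 2 never reaches state 1, so first_time and last_time are missing for all of its observations and is_first is 0 throughout. This sketch assumes -time- is never missing and is unique within -id-; with duplicate times, more than one observation could be flagged by is_first.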

