



Re: st: Capturing the date and which something first occurs


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Capturing the date and which something first occurs
Date   Sun, 8 Apr 2012 15:28:03 +0100

Richard's question is how to get an indicator variable for the first
occurrence of an event. At its simplest there is a -state- indicator
and we are looking for the first time that -state == 1-.

The approach in the FAQ I wrote on this question

http://www.stata.com/support/faqs/data/firstoccur.html

is, I now think, too indirect. I would now urge a focus on finding the
date of the first occurrence; an indicator is then just given by
whether the date variable is equal to that first date.

The first date is just the minimum and we can get that easily, even
with panel data, using -egen-:

egen first_date = min(date / (state == 1)), by(id)

or

egen first_date = min(cond(state == 1, date, .)), by(id)

The expressions fed to the -min()- function of -egen- are

date / (state == 1)

cond(state == 1, date, .)

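They are equivalent and are both focused on getting -egen- to ignore
everything except the times when -state == 1-. Whenever -state == 1- is
false, the first expression divides by zero and the second returns its
third argument, so both yield missing, which -min()- ignores. If a panel
has no times with -state == 1-, the expression is missing in every
observation of that panel and the result is missing, which is the right
answer.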
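
As a check on the claim above, here is a minimal sketch on a small made-up
panel (the data and the names first_a, first_b and is_first are invented for
illustration; they are not from the thread). It computes the first date both
ways, asserts that the results agree, and then builds the indicator from the
first date.

```stata
* Sketch only: toy data, not the data set discussed in the thread
clear
input id date state
1 1 0
1 2 0
1 3 1
1 4 1
2 1 0
2 2 0
3 1 1
3 2 0
end

* First date with state == 1, computed with each expression
egen first_a = min(date / (state == 1)), by(id)
egen first_b = min(cond(state == 1, date, .)), by(id)

* Both versions agree, including id 2, where both are missing
assert first_a == first_b

* Indicator for the first occurrence; the !missing() guard only matters
* if date itself can be missing, because missing == missing is true
gen byte is_first = date == first_a & !missing(first_a)

list, sepby(id)
```

In this toy panel id 2 never has -state == 1-, so its first date is missing
and its indicator is 0 throughout.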

Although for simplicity we are in this example looking for -state ==
1-, i.e. values of 1 for an indicator for what interests us, that is
just detail. The main idea generalises easily to any true-or-false
condition:

egen first_date = min(date / (foo == 42)), by(id)

Another nice feature of this approach is that it extends easily to
giving the _last_ date, using -max()- as the -egen- function.
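
A minimal sketch of that extension, again on made-up data with invented
variable names (last_date, obs, first_obs). The last two commands apply the
same device to the within-panel observation number rather than the date,
which is one reading of what the original question asked for; treat that
reading as an assumption.

```stata
* Sketch only: toy data, not the data set discussed in the thread
clear
input id date state
1 1 0
1 2 0
1 3 1
1 4 1
2 1 0
2 2 0
3 1 1
3 2 1
3 3 0
end

* Last date with state == 1: same expression, max() instead of min()
egen last_date = max(date / (state == 1)), by(id)

* Same device applied to the observation number within each id
bysort id (date): gen long obs = _n
egen first_obs = min(obs / (state == 1)), by(id)

list, sepby(id)
```

Here id 3 has -state == 1- at dates 1 and 2 only, so its last date is 2
rather than the final date in the panel.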

Nick

On Sat, Apr 7, 2012 at 7:21 AM, Nick Cox <njcoxstata@gmail.com> wrote:

> Also consider
>
> gsort id -state time
> by id: gen date_first = time[1] if state[1] == 1
> gen is_first = time == date_first

On Fri, Apr 6, 2012 at 6:41 PM, Nick Cox <njcoxstata@gmail.com> wrote:

>> When the first occurrence occurred is discussed in the same FAQ
>>
>> http://www.stata.com/support/faqs/data/firstoccur.html
>>
>> Here is another way to do it.
>>
>> egen date_first = min(time / state), by(id)
>>
>> Explanation: This is the trick that dividing by zero can be useful.
>> time / 0 is returned as missing and thus ignored in the calculation of
>> the minimum, as long as the -state- did occur.

On Fri, Apr 6, 2012 at 6:28 PM, Richard T. Campbell <dcamp@uic.edu> wrote:

>>> Suppose I have a data set like that used by Nick Cox in an FAQ which shows
>>> how to capture a record at which something first occurs. Here is his example.
>>>
>>>
>>>     +---------------------------+
>>>     | id   time   state   first |
>>>     |---------------------------|
>>>  1. |  1      1       0       0 |
>>>  2. |  1      2       0       0 |
>>>  3. |  1      3       0       0 |
>>>  4. |  1      4       1       1 |
>>>  5. |  1      5       1       0 |
>>>  6. |  1      6       1       0 |
>>>  7. |  1      7       1       0 |
>>>  8. |  1      8       1       0 |
>>>  9. |  1      9       1       0 |
>>>  10. |  1     10       1       0 |
>>>     |---------------------------|
>>>  11. |  2      2       0       0 |
>>>  12. |  2      2       0       0 |
>>>  13. |  2      3       1       1 |
>>>  14. |  2      4       1       0 |
>>>  15. |  2      5       1       0 |
>>>  16. |  2      6       1       0 |
>>>  17. |  2      7       1       0 |
>>>  18. |  2      8       1       0 |
>>>  19. |  2      9       0       0 |
>>>  20. |  2     10       0       0 |
>>>     |---------------------------|
>>>  21. |  3      1       0       0 |
>>>  22. |  3      2       1       1 |
>>>  23. |  3      3       0       0 |
>>>     +---------------------------+
>>>
>>> So, for ID 1, the first time at which state = 1 occurs is the fourth record,
>>> for ID 2 it is the third record etc. I want to assign a value within an id
>>> equal to that index. For example, for ID 1 I want a variable that equals 4
>>> for all ten cases, for ID 2 a variable equal to 3 for all cases etc. Put
>>> differently, I want to assign to all cases within an id, the value of _n
>>> when first = 1. I can't seem to get my head around how to do this.
>>>
