



Re: st: Tracking attrition in a long-shaped dataset


From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: Tracking attrition in a long-shaped dataset
Date   Thu, 21 Mar 2013 15:09:40 +0000

Yes, that helps considerably. Have a look at

FAQ     . . . . . . Identifying runs of consecutive observations in panel data
        . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox and V. Wiggins
        8/05    How do I identify runs of consecutive observations
                in panel data?
                http://www.stata.com/support/faqs/data-management/identifying-runs-of-consecutive-observations/

and see how far that gets you. Advice to look at -tsspell- (SSC) is included.

(The standard advice in the Statalist FAQ is to look at the FAQs
before posting.)

Nick
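
[Editorial sketch, not part of the original message.] The idea of a "run of consecutive observations" can be illustrated with -by:- and a running sum. This is a minimal sketch in the spirit of the FAQ cited above, not its exact code. The toy data and the variable names run, seq and runlength are assumptions made for illustration.

```stata
* Toy data (hypothetical): person 1 appears in months 1, 2, 4; person 2 in months 1, 3
clear
input ID month
1 1
1 2
1 4
2 1
2 3
end

* A new run starts whenever month is not exactly one more than the previous month.
* For the first observation of each ID, month[_n-1] is missing, so a run starts there.
bysort ID (month): gen run = sum(month != month[_n-1] + 1)

* Position within each run, and the length of each run
bysort ID run (month): gen seq = _n
bysort ID run: gen runlength = _N

list, sepby(ID)
```

Here person 1 has one run covering months 1 and 2 and a second run for month 4, so a gap shows up as a change in run.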

On Thu, Mar 21, 2013 at 2:52 PM, Max <[email protected]> wrote:
> Hi Nick,
>
> Thanks for the quick response. Let me clarify. Month is a whole
> number, representing a time period, so a person might appear in month
> =1, month=2, month=4, but not month=3. In that case, he would have
> skipped month 3. Thus, using the code in #1 I would code him as having
> returned in the first row he appears (month=1), but not in the second
> row (month=2). So yes, month increments by 1, but time marches on
> regardless of whether a person appears in that month or not.
>
> Does that make more sense? If not, do let me know and I'd be happy to
> clarify further.
>
> Max
>
> On Thu, Mar 21, 2013 at 10:44 AM, Nick Cox <[email protected]> wrote:
>> What your -month- variable is holding is unclear to me.
>>
>> In terms of your questions
>>
>> #1. It is evident that
>>
>> month == month[_n+1] - 1
>>
>> is true if and only if -month- increases by 1 from one observation to the next.
>>
>> It's difficult to check that against your word description. Usually
>> with panel data, there is a time variable and then all the business is
>> centred on what is or is not true at different times. Here you seem to
>> be focussing entirely on the time variable.
>>
>> #2. Understanding this depends on understanding #1, and there I evidently failed.
>>
>> If you don't get a better answer, you will need to ask a better question.
>>
>> Nick
>>
>> On Thu, Mar 21, 2013 at 2:19 PM, Max <[email protected]> wrote:
>>
>>> I have a long dataset of ID's (people) and months. People enter and leave
>>> the dataset at various points, and can skip months. I want to track
>>> attrition by doing two things:
>>> 1. Create a dummy variable = 1 in time t if the person appears in time t+1,
>>> 0 otherwise (but missing for the last month). I think I've solved this one,
>>> but am always curious to hear if anyone has any alternate methods that
>>> might be better. Here is my solution:
>>>
>>> bysort ID (month): gen returned = 1 if month == month[_n+1] - 1
>>>
>>> 2. And this is where I'm stuck. I want to create a dummy variable = 1 in
>>> time t if the person appears in time t+2, regardless of whether they appear
>>> in time t+1 or not. I tried:
>>>
>>> bysort ID (month): gen returned_2month = 1 if (month == month[_n+2] - 2)
>>>
>>> But that didn't work because someone who, say, appears in months 1 and 3
>>> will not have an entry for month[_n+2]. But they should in fact be coded as
>>> a 1.
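
[Editorial sketch, not part of the original thread.] One possible way to build both dummies is to declare the data as a panel and use the time-series lead operators, which refer to time t+1 and t+2 rather than to the next row. This is a sketch under stated assumptions: the toy data are hypothetical, and "the last month" is taken to mean the last month observed anywhere in the dataset. Note also that -gen returned = 1 if ...- leaves non-matching observations missing rather than 0, so an indicator expression is used here instead.

```stata
* Toy data (hypothetical): person 1 appears in months 1, 2, 4; person 2 in months 1, 3
clear
input ID month
1 1
1 2
1 4
2 1
2 3
end

* Declare the panel structure; gaps in month are allowed
tsset ID month

* F1.month and F2.month are missing when the person has no observation at t+1 or t+2
gen byte returned        = !missing(F1.month)
gen byte returned_2month = !missing(F2.month)

* Assumption: set to missing where t+1 or t+2 lies beyond the last month in the data
summarize month, meanonly
local last = r(max)
replace returned        = . if month + 1 > `last'
replace returned_2month = . if month + 2 > `last'

list, sepby(ID)
```

With these data, person 2 in month 1 gets returned = 0 and returned_2month = 1, which is the "months 1 and 3" case described in the question.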