Statalist



Re: st: Logical condition


From   wgould@stata.com (William Gould, StataCorp LP)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Logical condition
Date   Thu, 15 Oct 2009 09:40:12 -0500

Dmitriy Krichevskiy <krichevskyd@gmail.com> writes, 

> I have a large panel dataset for which I was hoping to analyze a
> particular subset of agents. More specifically:
> 
>         id time    b1  i1
>          1    1     1   3
>          1    2     1   4
>          1    3     0   2
>          2    1     0   5
>          2    2     1   6
>          2    3     0   4
>          3    1     0   2
>          3    2     1   3
>          3    3     1   1
>
> 
> I want to select only those agents from above example who have
> switched their b1 status from 0 to 1 in the first two periods (agents
> 2 and 3 above). 

One solution is, 

     1.  Make a variable that marks obs. for which time==1 & b1==0.
         The variable is equal to 1 if the statement is true, 0 if false.
         Call this variable cond1.

     2.  Make a variable that marks obs. for which time==2 & b1==1.
         Call this variable cond2.

     3.  For each id, make cond1=1 in all obs. if it is true in any obs.

     4.  For each id, make cond2=1 in all obs. if it is true in any obs.

     5.  Keep observations for which cond1 & cond2 are true.

In Stata code, those steps are

        gen cond1 = (time==1 & b1==0)           // (1)

        gen cond2 = (time==2 & b1==1)           // (2)

        sort id                                 // (3)
        by id: replace cond1 = sum(cond1)
        by id: replace cond1 = cond1[_N]

        by id: replace cond2 = sum(cond2)       // (4)
        by id: replace cond2 = cond2[_N]

        keep if cond1 & cond2                   // (5)

Here's more concise, equivalent code, 

        sort id
        by id: egen cond1 = max(time==1 & b1==0)
        by id: egen cond2 = max(time==2 & b1==1)
        keep if cond1 & cond2
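
As a quick check, here is a self-contained sketch that types in the nine
example observations and runs the concise version.  The input block and
the final assert and list lines are additions for illustration only, and
bysort is used here in place of a separate sort.  With the example data,
agents 2 and 3 should remain.

        clear
        input id time b1 i1
        1 1 1 3
        1 2 1 4
        1 3 0 2
        2 1 0 5
        2 2 1 6
        2 3 0 4
        3 1 0 2
        3 2 1 3
        3 3 1 1
        end

        bysort id: egen cond1 = max(time==1 & b1==0)
        bysort id: egen cond2 = max(time==2 & b1==1)
        keep if cond1 & cond2

        assert inlist(id, 2, 3)         // only agents 2 and 3 are left
        assert _N == 6                  // three observations for each
        list id time b1 i1, sepby(id)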


I often use the method above.  The generic problem is

     1.  You have long data.   

     2.  You want to choose all the observations for an id 
         for which a complicated condition is true.

The obvious solution is to switch the data to wide form, but often 
it is easier to create the separate logical variables at the 
detailed level and then convert them to 1 everywhere within id 
if they are 1 anywhere.
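
For comparison, the wide-form route might look like the sketch below.  It
assumes the original long data are in memory with only the variables id,
time, b1, and i1, and that time takes the values 1, 2, 3, so that reshape
creates b11, b12, b13 and i11, i12, i13.

        reshape wide b1 i1, i(id) j(time)     // one obs. per id
        keep if b11==0 & b12==1               // 0 at time 1, 1 at time 2
        reshape long b1 i1, i(id) j(time)     // back to one obs. per id-time

This is short for a two-period condition, but every condition then has to
be written in terms of the wide variable names, which is where the
long-form logical variables tend to be easier.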

-- Bill
wgould@stata.com


