Re: st: creating variable summarizing for each individual properties of other members of a group at t-1


From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: creating variable summarizing for each individual properties of other members of a group at t-1
Date   Fri, 20 May 2011 09:35:50 +0100

This could be done with calls to -by:-, -egen- and clever use of _n and/or _N.

But the most Stataish way is to -collapse- to a dataset based on
years, with (presumably) the last month in each year being used. Then
do your manipulations in that dataset and -merge- back with the
monthly dataset.
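
A minimal, untested sketch of that approach follows. It reuses the variable names from the quoted messages below; prevscore2_year is a hypothetical name introduced here, and the [_n-1] step assumes each ind_id is observed in every consecutive year.

```stata
* cumulative entrant flag per individual (0 before first entry, 1 from then on)
bysort ind_id (yearmonth) : gen ind_score2 = min(sum(ind_entry), 1)

preserve
* one observation per individual and year; because the flag never decreases,
* its maximum within a year equals its value in the last observed month
collapse (max) ind_score2, by(ind_id year)

* entrants among all individuals in each year, excluding the focal individual
egen all_score2 = total(ind_score2), by(year)
replace all_score2 = all_score2 - ind_score2

* peer entrants up to and including the previous year (hypothetical name)
bysort ind_id (year) : gen prevscore2_year = all_score2[_n-1]

keep ind_id year prevscore2_year
tempfile yearly
save `yearly'
restore

* attach the yearly measure to every monthly observation
merge m:1 ind_id year using `yearly', nogenerate
```

The first year observed for each ind_id gets a missing prevscore2_year, just as the first yearmonth does in the monthly version.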

Nick

On Fri, May 20, 2011 at 8:41 AM, Erik Aadland <[email protected]> wrote:
> For the purpose of sensitivity analysis, I would like to adjust the suggested code below to create an alternative "peer entry" variable based on year instead of yearmonth, so that prevscore2 gives the number of peer entrants up until and including year-1 (instead of summing by yearmonth and taking the score at yearmonth-1). I tried replacing yearmonth with year in the suggested code, but this does not work, because the code then adds up the scores for each ind_id from every yearmonth within a given year.
>
> The example output below is based on my replacing yearmonth with year in the suggested code. This replacement is an inadequate solution. ind_id 2 and ind_id 4 get a prevscore2 for 2003 of 3 and 2 respectively, when it should be 1 for ind_id 2 and 0 for ind_id 4:
>
> year   month   yearmonth   ind_id   ind_entry   ind_score2   all_score2   prevscore2
> 2003   10      10          2        0           0            3
> 2003   11      11          2        0           0            3            3
> 2003   12      12          2        0           0            3            3
> 2004   1       13          2        0           0            5            3
> 2004   2       14          2        1           1            4            5
> 2004   3       15          2        0           1            4            4
> 2003   10      10          4        1           1            2
> 2003   11      11          4        0           1            2            2
> 2003   12      12          4        1           1            2            2
> 2004   1       13          4        0           1            4            2
> 2004   2       14          4        0           1            4            4
> 2004   3       15          4        0           1            4            4
>
> Is there a smart way to adjust the code to account for the change in focal time unit?
>
> Sincerely,
>
> Erik Aadland.
>
>
>
> ----------------------------------------
>> From: [email protected]
>> To: [email protected]
>> Date: Thu, 19 May 2011 13:34:33 +0100
>> Subject: RE: st: creating variable summarizing for each individual properties of other members of a group at t-1
>>
>> That's much clearer to me, or rather I now realise some stupid misunderstandings of your earlier posts.
>>
>> I've used different variable names. I now suggest this.
>>
>> bysort ind_id (yearmonth) : gen ind_score2 = min(sum(ind_entry), 1)
>> egen all_score2 = total(ind_score2), by(yearmonth)
>> replace all_score2 = all_score2 - ind_score2
>> bysort ind_id (yearmonth) : gen prevscore2 = all_score2[_n-1]
>>
>> Note that this code, which reproduces your CORRECT column, is much simpler than the earlier bad versions.
>>
>> Nick
>> [email protected]
>>
>> Erik Aadland
>>
>> I will try to explain why the suggested code below does not solve my second problem and what it gets wrong. It fails because the all_score for a given ind_id includes the ind_entry = 1 contribution of that same ind_id. I need the variable to sum the number of peer entrants (the count of unique ind_ids, excluding the focal ind_id) over yearmonths for each ind_id. Once an ind_id has experienced ind_entry = 1, that ind_id is considered an entrant, and subsequent ind_entry = 1 events for that ind_id do not change that. Given the code below, an ind_id gets an all_score and prevscore that include their own entry. It seems problematic to me to consider an ind_id to be a peer to him or herself.
>>
>> See the resulting output below. I have entered an additional column to the right indicating the correct prevscore for each ind_id.
>>
>> year   month   yearmonth   ind_id   ind_entry   ind_score   all_score   prevscore   CORRECT prevscore
>> 2003   10      10          2        0           0           1
>> 2003   11      11          2        0           0           1           1           1
>> 2003   12      12          2        0           0           1           1           1
>> 2004   1       13          2        0           0           1           1           1
>> 2004   2       14          2        1           1           1           1           1
>> 2004   3       15          2        0           0           2           1           1
>> 2003   10      10          4        1           1           0
>> 2003   11      11          4        0           0           1           0           0
>> 2003   12      12          4        1           0           1           1           0
>> 2004   1       13          4        0           0           1           1           0
>> 2004   2       14          4        0           0           2           1           0
>> 2004   3       15          4        0           0           2           2           1
>>
>> Sincerely,
>>
>> Erik Aadland.
>>
>>
>>
>> > From: [email protected]
>> > To: [email protected]
>> > Date: Wed, 18 May 2011 18:36:37 +0100
>> > Subject: RE: st: creating variable summarizing for each individual properties of other members of a group at t-1
>> >
>> > My code
>> >
>> > bysort ind_id (yearmonth) : gen ind_score = sum(ind_entry)
>> > by ind_id : replace ind_score = ind_score == 1 & ind_score[_n-1] != 1
>> > sort yearmonth
>> > gen all_score = sum(ind_score)
>> > by yearmonth : replace all_score = all_score[_N]
>> > replace all_score = all_score - ind_score
>> > bysort ind_id (yearmonth) : gen prevscore = all_score[_n-1]
>> >
>> > was certainly intended to solve your second problem. I've not tested it. Are you saying it doesn't? And if it doesn't what does it get wrong?
>> >
>> > Nick
>> > [email protected]
>> >
>> > Erik Aadland
>> >
>> > Thank you Nick and Jorge for your suggestions. They were very helpful, and I am very grateful.
>> >
>> > Jorge, your suggested code below worked perfectly for my "first" variable.
>> >
>> > I am still struggling with my "second" variable. For each ind_id, it should count the total number of other ind_ids in the dataset, excluding the focal ind_id, that have experienced ind_entry = 1 at least once up until and including yearmonth-1. In other words, I am trying to create a variable that for each individual tracks the number of other entrants in the dataset up until and including yearmonth-1. I am trying to track which ind_ids have entered, not how many times they have entered.
>> >
>> > Any and all input on this problem would be very much appreciated.
