RE: st: RE: creating variable summarizing for each individual properties of other members of a group at t-1


From   Erik Aadland <erikaadland@hotmail.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: creating variable summarizing for each individual properties of other members of a group at t-1
Date   Tue, 17 May 2011 18:13:45 +0000

Thank you so much for your input, Nick.
 
I have previously experimented with and generated various variables, relying on the very helpful FAQ in question.

I am struggling with this problem, however. When I apply the suggested code below, the score appears to be too high for ind_ids that have more ind_entry = 1 observations than other ind_ids, because their own entries contribute more to the "score" than those of ind_ids with fewer ind_entry = 1.

In the example dataset, ind_id 2 is given the correct "prevscore". Ind_id 4, however, is not. By yearmonth 12, ind_id 4 has contributed two ind_entry = 1 observations to the "score", which is correct for ind_id 2. However, ind_id 2 has not yet experienced ind_entry = 1, so the correct score for ind_id 4 is 0. Instead, score - ind_entry yields a score of 1 for ind_id 4 in yearmonth 11 and 12. And so on.
 
Here is the suggested code as I applied it: 
 
#delimit ;
clear ;
use "ind_entry_ex.dta" ;
sort yearmonth ;
gen score = sum(ind_entry) ;
by yearmonth: replace score = score[_N] ;
replace score = score - ind_entry ;
bysort ind_id (yearmonth): gen prevscore = score[_n-1] ;

 
Here is the output:
 
year  month  yearmonth  ind_id  ind_entry  score  prevscore
2003  10     10         2       0          1
2003  11     11         2       0          1      1
2003  12     12         2       0          2      1
2004  1      13         2       0          2      2
2004  2      14         2       1          2      2
2004  3      15         2       0          3      2
2003  10     10         4       1          0
2003  11     11         4       0          1      0
2003  12     12         4       1          1      1
2004  1      13         4       0          2      1
2004  2      14         4       0          3      2
2004  3      15         4       0          3      3
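
The pattern in this output is consistent with the code subtracting only the current month's ind_entry from a cumulative total, so that an individual's own earlier entries remain in the score. A minimal sketch of one possible adjustment follows. It assumes that the goal is the cumulative number of entries by all other individuals, that there is one observation per ind_id per yearmonth, and that ind_entry has no missing values. The variable names total and own are introduced here for illustration only.

```stata
* Sketch only, under the assumptions stated above
use "ind_entry_ex.dta", clear

* Cumulative number of entries by everyone through each yearmonth
sort yearmonth
gen total = sum(ind_entry)
by yearmonth: replace total = total[_N]

* Cumulative number of this individual's own entries
bysort ind_id (yearmonth): gen own = sum(ind_entry)

* Entries by all other individuals through the current yearmonth
gen score = total - own

* Look one observation back in time within each individual
bysort ind_id (yearmonth): gen prevscore = score[_n-1]
```

On the example data this should give prevscore 1, 1, 2, 2, 2 for ind_id 2 and 0, 0, 0, 0, 1 for ind_id 4 from yearmonth 11 onward. Note that score[_n-1] refers to the previous observation of the same individual, which corresponds to yearmonth - 1 only when there are no gaps in the panel, and that an individual who enters the observation window late has a missing prevscore in the first observation.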

I use Stata 10.
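
For the second variable in the original question (the number of other individuals who have experienced ind_entry = 1 at least once up to yearmonth - 1), a similar sketch may help. It assumes one observation per ind_id per yearmonth and no missing ind_entry, and it uses only by-group commands and functions that should be available in Stata 10. The variable names ever, first, n_ever, others_ever and prev_others_ever are hypothetical.

```stata
* Sketch only, under the assumptions stated above
use "ind_entry_ex.dta", clear

* 1 once this individual has had at least one ind_entry = 1
bysort ind_id (yearmonth): gen byte ever = sum(ind_entry) > 0

* 1 only in the observation where the individual enters for the first time
bysort ind_id (yearmonth): gen byte first = (ever == 1) & (ever[_n-1] != 1)

* Number of individuals who have entered at least once through each yearmonth
sort yearmonth
gen n_ever = sum(first)
by yearmonth: replace n_ever = n_ever[_N]

* Exclude the individual itself, then look one observation back
gen others_ever = n_ever - ever
bysort ind_id (yearmonth): gen prev_others_ever = others_ever[_n-1]
```

On the example data this should give prev_others_ever equal to 1 throughout for ind_id 2 and 0, 0, 0, 0, 1 for ind_id 4 from yearmonth 11 onward. As with any [_n-1] lookup, the previous observation equals yearmonth - 1 only if the panel has no gaps.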
 
 
Thanks again and kind regards,
 
Erik.


----------------------------------------
> From: n.j.cox@durham.ac.uk
> To: statalist@hsphsun2.harvard.edu
> Date: Tue, 17 May 2011 17:47:53 +0100
> Subject: st: RE: creating variable summarizing for each individual properties of other members of a group at t-1
>
> I don't know what "I am familiar with" means here. Does it mean that you've read the FAQ but can't see how to apply it?
>
> This sounds to me like
>
> 1. Get the sum of all individual entries
>
> sort yearmonth
> gen score = sum(ind_entry)
> by yearmonth : replace score = score[_N]
>
> 2. Subtract this individual
>
> replace score = score - ind_entry
>
> 3. Look one step back in time
>
> bysort ind_id (yearmonth) : gen prevscore = score[_n-1]
>
> Nick
> n.j.cox@durham.ac.uk
>
> Erik Aadland
>
> I need to create a variable that sums for each individual in my dataset the total number of ind_entry of all other individuals at time: yearmonth - 1.
> I have included a small example of my data structure below. For instance, given that dataset, for ind_id 2 in yearmonth 11 this variable score = 1, but for ind_id 4 in the same yearmonth, the score = 0.
>
> I would also like to generate a variable that identifies for each individual the unique number of other individuals in the dataset that have experienced ind_entry = 1 at least once up until time: yearmonth - 1.
>
> I am familiar with the following FAQ: http://www.stata.com/support/faqs/data/members.html
>
> My data structure is snapshot data in principle like the example below, but some individuals enter the observation window later than others (i.e. in later yearmonths):
>
> year  month  yearmonth  ind_id  ind_entry
> 2003  10     10         2       0
> 2003  11     11         2       0
> 2003  12     12         2       0
> 2004  1      13         2       0
> 2004  2      14         2       1
> 2004  3      15         2       0
> 2003  10     10         4       1
> 2003  11     11         4       0
> 2003  12     12         4       1
> 2004  1      13         4       0
> 2004  2      14         4       0
> 2004  3      15         4       0
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/ 		 	   		  

