



st: RE: generating variables based on the co-occurrence of ids in groups over time


From   Nick Cox <n.j.cox@durham.ac.uk>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   st: RE: generating variables based on the co-occurrence of ids in groups over time
Date   Wed, 7 Mar 2012 12:17:15 +0000

Here are some doodlings: 

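* One indicator per distinct ind_id: ind_id1, ind_id2, ...
* (suffixes follow the sorted order of the ids, so they equal the ids
* only when the ids run 1, 2, ..., k, as in the example data)
tab ind_id, gen(ind_id)
drop ind_id
* Build the collapse specification: the sum of every indicator
foreach v of var ind_id* {
	local call `call' (sum) `v'
}
* One row per project; each indicator counts that id's rows in the project
collapse `call', by(yearmonth project_id)
list
* Number of members in each project
egen count_id = rowtotal(ind_id*)
* Recover the list of suffixes from the indicator names
unab ind_id : ind_id*
local ind_id : subinstr local ind_id "ind_id" "", all
* For each individual: the number of other members in each of their projects
foreach id of local ind_id {
	gen collab`id' = count_id - ind_id`id' if ind_id`id' == 1
}
edit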

Not a complete solution, but may help. 
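
For anyone who wants to check the result outside Stata, here is a rough sketch in Python (not Stata, purely for illustration) of the quantity the commands above arrive at: for each individual in each project, the number of other members of that project. The function name collaborators_per_project and the tuple layout are assumptions made for this sketch. It also assumes each ind_id appears at most once per project. Like the Stata code, it does not yet look back over earlier yearmonths.

```python
from collections import Counter

def collaborators_per_project(rows):
    """rows: iterable of (yearmonth, project_id, ind_id) tuples.

    Returns {(yearmonth, project_id, ind_id): number of other members}.
    Assumes each ind_id appears at most once per project.
    """
    rows = list(rows)
    # Project size, counted within each yearmonth/project combination.
    size = Counter((ym, project) for ym, project, _ in rows)
    return {(ym, project, ind): size[(ym, project)] - 1
            for ym, project, ind in rows}

# Example data from the original question.
example_rows = [
    (5, 1, 1), (5, 1, 2), (5, 1, 3),
    (5, 2, 1), (5, 2, 4), (5, 2, 5),
    (6, 3, 1), (6, 3, 2), (6, 3, 5),
    (6, 4, 4), (6, 4, 5), (6, 4, 6),
    (7, 5, 1), (7, 5, 4), (7, 5, 5), (7, 5, 2),
]
collab = collaborators_per_project(example_rows)
```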


Nick 
n.j.cox@durham.ac.uk 


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Erik Aadland
Sent: 07 March 2012 11:15
To: statalist@hsphsun2.harvard.edu
Subject: st: generating variables based on the co-occurrence of ids in groups over time

Dear Statalist.
 
I am struggling to generate two variables based on the co-occurrence of ind_ids in project_ids over time (yearmonth).
 
Structure of my data is as follows:
 
yearmonth    project_id    ind_id
5            1             1
5            1             2
5            1             3
5            2             1
5            2             4
5            2             5
6            3             1
6            3             2
6            3             5
6            4             4
6            4             5
6            4             6
7            5             1
7            5             4
7            5             5
7            5             2

 
The two variables I need to generate are:
 
X (no. of prior collaborators in project for each ind_id): how many of the other individuals in project_id each ind_id has previously collaborated with (i.e. how many of the other ind_ids in the current project that each focal ind_id has co-occurred with in other projects in previous yearmonths) 
 
Z (total prior collaborations in project for each ind_id): the total number of times each ind_id has previously collaborated with the given other individuals in project_id (i.e. the total number of times each focal ind_id has co-occurred with other ind_ids in the current project in previous yearmonths)
 
I have added the X and Z values to the example data below:
 
yearmonth    project_id    ind_id    X    Z 
5            1             1  
5            1             2  
5            1             3  
5            2             1  
5            2             4  
5            2             5  
6            3             1         2    2
6            3             2         1    1
6            3             5         1    1
6            4             4         1    1
6            4             5         1    1
6            4             6         0    0
7            5             1         3    5
7            5             4         2    3
7            5             5         3    5
7            5             2         2    3
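
The definitions of X and Z can be checked against the table above with a short sketch. It is written in Python rather than Stata, only to make the logic explicit, and it rests on several assumptions: "previous yearmonths" is read as strictly earlier yearmonths, so projects in the same yearmonth do not count; each ind_id appears at most once per project; and rows with no earlier history get 0 rather than the blanks shown in the table. The function name prior_collaboration_scores is made up for this sketch.

```python
from collections import defaultdict

def prior_collaboration_scores(rows):
    """rows: iterable of (yearmonth, project_id, ind_id) tuples.

    Returns {(yearmonth, project_id, ind_id): (X, Z)} where
    X = number of other current project members the focal individual
        shared a project with in a strictly earlier yearmonth, and
    Z = total number of such earlier shared projects.
    """
    # Group members by project within each yearmonth.
    months = defaultdict(lambda: defaultdict(list))
    for ym, project, ind in rows:
        months[ym][project].append(ind)

    pair_counts = defaultdict(int)  # (focal, other) -> earlier shared projects
    scores = {}
    for ym in sorted(months):
        # Score every project of this yearmonth against earlier ones only.
        for project, members in months[ym].items():
            for focal in members:
                counts = [pair_counts.get((focal, other), 0)
                          for other in members if other != focal]
                x = sum(1 for c in counts if c > 0)
                z = sum(counts)
                scores[(ym, project, focal)] = (x, z)
        # Only now add this yearmonth's projects to the history.
        for members in months[ym].values():
            for focal in members:
                for other in members:
                    if other != focal:
                        pair_counts[(focal, other)] += 1
    return scores

# Example data from the question.
rows = [
    (5, 1, 1), (5, 1, 2), (5, 1, 3),
    (5, 2, 1), (5, 2, 4), (5, 2, 5),
    (6, 3, 1), (6, 3, 2), (6, 3, 5),
    (6, 4, 4), (6, 4, 5), (6, 4, 6),
    (7, 5, 1), (7, 5, 4), (7, 5, 5), (7, 5, 2),
]
scores = prior_collaboration_scores(rows)
```

On the example data this reproduces the X and Z values listed for yearmonths 6 and 7. The history is updated only after a whole yearmonth has been scored, which is what keeps same-month projects out of the counts.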


Any and all input to these problems would be greatly appreciated.
 
I use Stata 10 and the panel data is unbalanced.
 


