



RE: st: RE: generating variables based on the co-occurrence of ids in groups over time


From   Erik Aadland <erikaadland@hotmail.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: generating variables based on the co-occurrence of ids in groups over time
Date   Wed, 7 Mar 2012 14:19:25 +0000

Thank you so much, Nick.
 
I will need to spend some time attempting to penetrate this (for me) rather advanced code.
 
Sincerely,
 
Erik.



> From: n.j.cox@durham.ac.uk
> To: statalist@hsphsun2.harvard.edu
> Date: Wed, 7 Mar 2012 12:17:15 +0000
> Subject: st: RE: generating variables based on the co-occurrence of ids in groups over time
> 
> Here are some doodlings: 
> 
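> * One indicator variable per individual: ind_id1, ind_id2, ...
> tab ind_id, gen(ind_id)
> drop ind_id 
> * Build the collapse request: (sum) ind_id1 (sum) ind_id2 ...
> local call
> foreach v of var ind_id* {
> local call `call' (sum) `v'
> }
> * One row per project, with a 0/1 membership flag for each individual
> collapse `call', by(yearmonth project_id)
> list
> * Project size
> egen count_id = rowtotal(ind_id*)
> * Recover the numeric suffixes 1, 2, ... from the indicator names
> unab ind_id : ind_id*
> local ind_id : subinstr local ind_id "ind_id" "", all
> * For each member of a project, the number of other members
> foreach id of local ind_id {
> gen collab`id' = count_id - ind_id`id' if ind_id`id' == 1
> }
> edit 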
> 
> Not a complete solution, but may help. 
> 
> 
> Nick 
> n.j.cox@durham.ac.uk 
> 
> 
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Erik Aadland
> Sent: 07 March 2012 11:15
> To: statalist@hsphsun2.harvard.edu
> Subject: st: generating variables based on the co-occurrence of ids in groups over time
> 
> Dear Statalist.
> 
> I am struggling to generate two variables based on the co-occurrence of ind_ids in project_ids over time (yearmonth).
> 
> Structure of my data is as follows:
> 
> yearmonth project_id ind_id
> 5 1 1
> 5 1 2
> 5 1 3
> 5 2 1
> 5 2 4
> 5 2 5
> 6 3 1
> 6 3 2
> 6 3 5
> 6 4 4
> 6 4 5
> 6 4 6
> 7 5 1
> 7 5 4
> 7 5 5
> 7 5 2
> 
> 
> The two variables I need to generate are:
> 
> X (no. of prior collaborators in project for each ind_id): how many of the other individuals in project_id each ind_id has previously collaborated with (i.e. how many of the other ind_ids in the current project that each focal ind_id has co-occurred with in other projects in previous yearmonths) 
> 
> Z (total prior collaborations in project for each ind_id): the total number of times each ind_id has previously collaborated with the given other individuals in project_id (i.e. the total number of times each focal ind_id has co-occurred with other ind_ids in the current project in previous yearmonths)
> 
> I have added the X and Z scores to the example data below:
> 
> yearmonth project_id ind_id X Z 
> 5 1 1 
> 5 1 2 
> 5 1 3 
> 5 2 1 
> 5 2 4 
> 5 2 5 
> 6 3 1 2 2
> 6 3 2 1 1
> 6 3 5 1 1
> 6 4 4 1 1
> 6 4 5 1 1
> 6 4 6 0 0
> 7 5 1 3 5
> 7 5 4 2 3
> 7 5 5 3 5
> 7 5 2 2 3
> 
> 
> Any and all input to these problems would be greatly appreciated.
> 
> I use Stata 10 and the panel data is unbalanced.
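One way to read the definitions of X and Z above is as counts over pairs of individuals. The following is only a sketch under that reading, not a tested solution. It assumes one row per individual per project, and the names members, partner, seq, prior and known are made up for illustration. It pairs everyone within a project using joinby, counts for each pair how many projects they shared in strictly earlier yearmonths, and then sums over partners.

```stata
* Example data from the question
clear
input yearmonth project_id ind_id
5 1 1
5 1 2
5 1 3
5 2 1
5 2 4
5 2 5
6 3 1
6 3 2
6 3 5
6 4 4
6 4 5
6 4 6
7 5 1
7 5 4
7 5 5
7 5 2
end

* Save a copy of the membership list with ind_id renamed to partner
tempfile members
preserve
rename ind_id partner
save `members'
restore

* One row per ordered pair (ind_id, partner) within each project
joinby yearmonth project_id using `members'
drop if partner == ind_id

* For each pair, number the shared projects in time order ...
bysort ind_id partner (yearmonth project_id): gen seq = _n
* ... and count only those in strictly earlier yearmonths
bysort ind_id partner yearmonth (seq): gen prior = seq[1] - 1

* X counts partners met before; Z sums the earlier joint projects
gen byte known = prior > 0
collapse (sum) X = known Z = prior, by(yearmonth project_id ind_id)
list, sepby(project_id)
```

Worked through by hand on the example, this should give the X and Z values shown for yearmonths 6 and 7. Two differences to be aware of: rows in the first yearmonth come out as 0 rather than blank, and anyone who is the only member of a project drops out at the drop step and would need to be merged back in with X and Z set to 0.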
> 
> 
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/ 		 	   		  

