



RE: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)


From   Erik Aadland <erikaadland@hotmail.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)
Date   Wed, 20 Feb 2013 12:52:03 +0000

Thanks again, Nick.
This is very helpful.
Kind regards,
Erik.

----------------------------------------
> Date: Wed, 20 Feb 2013 11:03:43 +0000
> Subject: Re: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)
> From: njcoxstata@gmail.com
> To: statalist@hsphsun2.harvard.edu
>
> I added some commentary below.
>
> On Wed, Feb 20, 2013 at 8:20 AM, Erik Aadland <erikaadland@hotmail.com> wrote:
> > Thank you so much, Nick.
> > The code appears to work perfectly.
> > I will compare this code to the previous code for the related measure and do my best to absorb what is going on.
> > Kind regards,
> > Erik.
> >
> > ----------------------------------------
> >> Date: Tue, 19 Feb 2013 19:34:59 +0000
> >> Subject: Re: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)
> >> From: njcoxstata@gmail.com
> >> To: statalist@hsphsun2.harvard.edu
> >>
> >> I don't know about smart, but this seems the same kind of problem.
> >> Change the order of the loops and let the list of colleagues
> >> accumulate from year to year for each actor.
> >>
> >> Have a look at this code
>
> Comment #1.
>
> The first step is just to create a small toy or sandpit dataset for
> which results are easy to derive. Erik provided this dataset himself
> and it's always a good idea. Naturally, the full dataset might expose
> problems not in the toy dataset, but one problem at a time....
>
> >> clear
> >> input year project_id actor_id condition
> >> 2000 1 1 1
> >> 2000 2 1 1
> >> 2000 1 2 0
> >> 2000 2 2 0
> >> 2000 1 3 0
> >> 2000 2 3 0
> >> 2000 3 4 1
> >> 2000 3 5 0
> >> 2000 3 6 0
> >> 2000 4 7 0
> >> 2001 5 1 1
> >> 2001 5 2 0
> >> 2001 6 2 0
> >> 2001 5 3 0
> >> 2001 6 3 0
> >> 2001 5 4 1
> >> 2001 6 4 1
> >> 2001 7 5 0
> >> 2001 7 6 0
> >> 2001 7 7 0
> >> 2001 8 8 0
> >>
> >> end
>
> Comment #2.
>
> I create variables using -egen-'s -group()- function that by
> construction run 1 ... # of distinct values. Then I can pick up the
> number of distinct values from -summarize, meanonly-. The maximum
> group identifier is the number required. This is no more than
> convenience, to make the loops to come very easy, but convenience
> beats its opposite.
>
> >> egen proj = group(project_id year), label
> >> su proj, meanonly
> >> local nproj = r(max)
> >>
> >> egen act = group(actor_id), label
> >> su act, meanonly
> >> local nact = r(max)
> >>
> >> egen yr = group(year), label
> >> su yr, meanonly
> >> local nyr = r(max)
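
A minimal sketch of the same counting device, on a hypothetical four-row dataset that is not part of the thread: -egen-'s -group()- numbers the distinct combinations 1, 2, ..., so the maximum left behind by -summarize, meanonly- is the number of distinct combinations.

```stata
* hypothetical toy data: three distinct (project_id, year) pairs
clear
input year project_id
2000 1
2000 2
2001 1
2001 1
end

* group() assigns 1, 2, ... to the distinct combinations
egen proj = group(project_id year)
summarize proj, meanonly
local nproj = r(max)
display "number of distinct project-years: `nproj'"
```

The last row repeats an existing pair, so it does not add to the count.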
>
> Comment #3.
>
> I initialise a counter variable. In essence we assume no co-actors,
> unless we find some, in which case we will change the counter. Often
> this command is inserted once you realise that the strategy is going
> to be
>
> Loop:
> Look at each possibility and work out the result.
> Put the result for that possibility in an existing variable.
>
> The second implies a -replace-, but that in turn requires a previous
> -generate- ahead of the loops.
>
> >> gen mywanted2 = 0
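
The generate-then-replace pattern described in Comment #3 can be sketched on its own. The variable names (id, result) and the squared value are assumptions for illustration only; the point is that -generate- runs once before the loop and -replace- fills in each case inside it.

```stata
* hypothetical data: three ids, two observations each
clear
set obs 6
gen id = ceil(_n/2)

* initialise once, ahead of the loop
gen result = 0

* fill in the result for each possibility in turn
forvalues i = 1/3 {
    quietly replace result = `i'^2 if id == `i'
}
list
```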
>
> Comment #4.
>
> Now the slope gets steeper! The main difficulty of the problem is the
> need to look in a group of other observations for co-actors. I went
> for list manipulation. -levelsof- gives you overall lists and the rest
> is looping over possibilities. Stuff discussed at -help macrolists- is
> invaluable.
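
As a rough illustration of how -levelsof- builds one list per group, here is a sketch on made-up data (the variables group and member are hypothetical). Each pass of the loop stores the sorted distinct values for one group in its own local macro.

```stata
* hypothetical data: members observed within two groups
clear
input group member
1 10
1 20
2 20
2 30
2 30
end

* one local macro per group: who1, who2
forvalues g = 1/2 {
    quietly levelsof member if group == `g', local(who`g')
}
display "group 1: `who1'"
display "group 2: `who2'"
```

Duplicates within a group are dropped by -levelsof- itself, which is why member 30 appears only once in the second list.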
>
> There are yet other possibilities, e.g. it is an open question whether
> you would be better off with a different data structure. If the number
> of actors on a project is small and their identifiers are of simple
> form, then all the identifiers could be stored as values of a string
> variable such as "1 3 5 8" and you could then treat the identifiers
> using -word()- and -wordcount()-. A wild guess is that this makes some
> things easier and some more difficult.
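
A sketch of the alternative data structure mentioned above, assuming a hypothetical string variable named members that holds space-separated identifiers. It only shows -wordcount()- and -word()- at work and is not a solution to the original problem.

```stata
* hypothetical data: one row per project, identifiers packed in a string
clear
input str20 members
"1 3 5 8"
"2 3"
end

* how many identifiers, and which come first and last
gen n_members = wordcount(members)
gen first = real(word(members, 1))
gen last  = real(word(members, -1))
list
```

A negative second argument to -word()- counts from the end of the string, so -1 picks out the last identifier.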
>
> >> * lists of those in each project and year and condition == 0
> >> qui forval p = 1/`nproj' {
> >>     levelsof act if proj == `p' & condition == 0, local(who`p')
> >> }
> >>
> >> macro list
> >>
> >> * now cycle over actors
> >> qui forval a = 1/`nact' {
> >>
> >>     * blank out workspace
> >>     local work
> >>
> >>     * cycle over years
> >>     qui forval y = 1/`nyr' {
> >>
> >>         * if actor was included, we want to add that list to workspace
> >>         forval p = 1/`nproj' {
> >>             count if act == `a' & proj == `p' & yr == `y'
> >>             if r(N) local work `work' `who`p''
> >>         }
> >>
> >>         * remove duplicates
> >>         local work : list uniq work
> >>         * remove this actor
> >>         local work : list work - a
> >>         * see what we got for debugging
> >>         noi di "`a' `work'"
> >>
> >>         replace mywanted2 = `: list sizeof work' if act == `a' & yr == `y'
> >>     }
> >> }
>
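
The three macro list operations used inside the loop can be tried in isolation. This is a minimal sketch with made-up list contents; the local names work and a simply mirror those in the code above.

```stata
* hypothetical accumulated list with repeats, and focal actor 1
local work 2 3 2 5 3 1
local a 1

* remove duplicates, keeping the order of first occurrence
local work : list uniq work
* remove the focal actor; both operands are macro names
local work : list work - a

display "`work'"
display `: list sizeof work'
```

Because `work' is blanked only once per actor, outside the loop over years, the list carries over from one year to the next, and that is what makes the count cumulative.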
> On Tue, Feb 19, 2013 at 4:35 PM, Erik Aadland <erikaadland@hotmail.com> wrote:
>
> >> > A while back I got assistance from the list with counting, separately for each actor_id and year, the number of distinct other actors meeting a certain condition with whom the actor_id had occurred together in projects.
> >> > Nick Cox suggested the code below that worked wonderfully.
> >> > This code generates a separate count for each actor_id and year.
> >> > I now face a new challenge. I would like to generate a similar measure that makes a cumulative count across years (rather than a separate count for each year). So, if actor_id == 1 collaborated with 2 other distinct actors in 2000, the score for actor_id == 1 would be 2 in 2000. If actor_id == 1 collaborated with one additional distinct actor that met the condition in 2001, the score would increase to 3 in 2001. (Distinct actors already counted in the 2000 score would not be counted again in 2001, even if they were present in projects together with the actor_id in 2001 as well.)
> >> > Is there a smart way to change the code below to generate this new measure?
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/ 		 	   		  

