RE: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)


From   Erik Aadland <erikaadland@hotmail.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)
Date   Tue, 16 Oct 2012 10:44:05 +0000

Thank you Nick!
You are quite right. I was imprecise; it is distinct actors I want to capture.
When I run your suggested code, I get this error message after the following line of code:

qui forval a = 1/`nact' { 
unexpected end of file
r(612);

What could possibly cause this error message? I am using Stata 10.
Thanks again and kind regards,
Erik.


----------------------------------------
> Date: Tue, 16 Oct 2012 11:19:41 +0100
> Subject: Re: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)
> From: njcoxstata@gmail.com
> To: statalist@hsphsun2.harvard.edu
>
> First off, on my list of hobby-horses is a prejudice that the word
> "unique" is misused here, although you are in very good company:
> StataCorp itself does it in various places, e.g. -codebook-, although
> I am working on changing their habits if I can. The word "unique"
> strictly means occurring once only; I recommend the word "distinct"
> for what you want. There is a longer discussion of terminology in
>
> SJ-8-4 dm0042 . . . . . . . . . . . . Speaking Stata: Distinct observations
> (help distinct if installed) . . . . . . N. J. Cox and G. M. Longton
> Q4/08 SJ 8(4):557--568
> shows how to answer questions about distinct observations
> from first principles; provides a convenience command
>
> That said, when faced with a problem like yours, vague ideas of
> possible solutions rise up. Is this a case for associative arrays as
> implemented in Mata? Is there a cunning restructuring of the data from
> which the answer would fall out easily? Precise inspiration was
> lacking and what seemed crucial was that you need to consider each
> actor in each combination of project and year. That pointed to
> loops over actors _and_ over project-years. Once that idea was taken
> up, life is usually easier if all identifiers run over the integers
> from 1 up. Also, the flavour of compiling a list and eventually
> counting distinct members of other actors suggested -levelsof- and the
> list manipulation tools documented at -help macrolists-.
>
> So, here is my code. Absolutely nothing rules out other kinds of solutions.
>
> input year project_id actor_id condition wanted
> 2000 1 1 1 2
> 2000 1 2 0 1
> 2000 1 3 0 1
> 2000 1 7 1 2
> 2000 2 1 1 2
> 2000 2 2 0 1
> 2000 2 3 0 1
> 2000 3 4 1 2
> 2000 3 5 0 1
> 2000 3 6 0 1
> 2000 3 . . .
> 2001 4 1 1 2
> 2001 4 2 0 1
> 2001 4 3 0 1
> end
>
> * identifiers guaranteed to run 1 up if the real ones don't!
> * note that "same project, same year" defines a group
> egen proj = group(project_id year), label
> su proj, meanonly
> local nproj = r(max)
>
> egen act = group(actor_id), label
> su act, meanonly
> local nact = r(max)
>
> gen mywanted = .
>
> * lists of those in each project and year and condition == 0
> qui forval p = 1/`nproj' {
> levelsof act if proj == `p' & condition == 0, local(who`p')
> }
>
> macro list
>
> * now cycle over actors
> qui forval a = 1/`nact' {
>
> * blank out workspace
> local work
>
> * if actor was included, we want to add that list to workspace
> * in practice -if r(N)- will be true if and only if -r(N)- is positive
> forval p = 1/`nproj' {
> count if act == `a' & proj == `p'
> if r(N) local work `work' `who`p''
> }
>
> * remove duplicates
> local work : list uniq work
> * remove this actor
> local work : list work - a
> * see what we got for debugging
> noi di "`a' `work'"
>
> replace mywanted = `: list sizeof work' if act == `a'
> }
>
> Nick
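
The list-manipulation steps in the loop above can be tried in isolation. What follows is a minimal sketch, assuming a made-up workspace list and a made-up focal actor number (neither comes from the thread's data). It uses only the macro list functions that appear in the posted code: -list uniq-, list subtraction and -list sizeof-.

```stata
* hypothetical workspace: actors collected from several projects, with repeats
local work 2 3 2 3 5
* hypothetical focal actor (an assumption for illustration only)
local a 3

* remove duplicates
local work : list uniq work
* remove the focal actor
local work : list work - a

* show what is left, and how many distinct other actors that is
di "`work'"
di `: list sizeof work'
```

Under those assumptions the remaining list should be 2 5, so the count is 2. This is the value that the -replace- line in the loop would assign to the focal actor.
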
>
>
> On Tue, Oct 16, 2012 at 9:52 AM, Erik Aadland <erikaadland@hotmail.com> wrote:
>
> > I am trying to generate a variable "wanted" that by each focal actor and year captures the total number of unique actors (excluding the focal actor) that meet a specified condition (condition == 0) and that the focal actor has occurred together with in one or more projects.
> > This is my data structure:
> > year project_id actor_id condition wanted
> > 2000 1 1 1 2
> > 2000 1 2 0 1
> > 2000 1 3 0 1
> > 2000 1 7 1 2
> > 2000 2 1 1 2
> > 2000 2 2 0 1
> > 2000 2 3 0 1
> > 2000 3 4 1 2
> > 2000 3 5 0 1
> > 2000 3 6 0 1
> > 2000 3 . . .
> > 2001 4 1 1 2
> > 2001 4 2 0 1
> > 2001 4 3 0 1
> > .....and so on
> > So in year == 2000, actor_id == 1 has occurred with 2 unique actor_id (namely 2 and 3) meeting condition == 0 in projects. Therefore, wanted == 2 for actor_id == 1 in year == 2000.
> > My attempted code (which is quite wrong):
> > sort actor_id year projects ;
> > by actor_id year: gen nvals = _n == 1 ;
> > sort actor_id year project_id ;
> > egen wanted = total(nvals & condition == 0), by(agency_id year) ;
> > replace wanted = wanted - (nvals & condition == 0) ;
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/ 		 	   		  

