From: Erik Aadland <erikaadland@hotmail.com>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)
Date: Tue, 16 Oct 2012 12:00:40 +0000

This is correct. So, referring to the results in my previous post: in year==2000, actor_id == 4|5|6 occur only in project_id==3, and for actor_id== 5 and 6 condition==0. Actor_id==4 should have a mywanted score == 2, while actor_id==5 and 6 should each have a mywanted score == 1.

Actor_id == 7 occurs only in project_id==4 this year and has shared projects with no other actor in this year (and therefore shares no project_id with any actor_id with condition==0), so it should have a mywanted score == 0.

It puzzles me why the suggested code generates correct mywanted scores for the actor_ids in project_id==1 and 2, but not in project_id== 3 and 4.

Kind regards,
Erik.

----------------------------------------
> Date: Tue, 16 Oct 2012 12:44:11 +0100
> Subject: Re: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)
> From: njcoxstata@gmail.com
> To: statalist@hsphsun2.harvard.edu
>
> What you asked for, as I understood it, was the total number of distinct actors
>
> 1. that meet a specified condition (condition == 0)
>
> and
>
> 2. with which any actor has shared one or more projects in the same year.
>
> (I am ignoring the word "focal", which you haven't defined and I don't
> understand.)
>
> If you want something else, please give the definition. It's not
> enough (for me) to say that the code gives the wrong answer in some
> cases. Note that my code gives the same answer for each actor, as it
> is a total over all (project, year) possibilities. If you want a
> different count for each (project, year) you'll need to modify the
> code accordingly.
>
> Nick
>
> On Tue, Oct 16, 2012 at 12:25 PM, Erik Aadland <erikaadland@hotmail.com> wrote:
> > The code works. Thank you Nick.
> > However, I am experiencing a few problems that I suspect stem from more detailed differences in my data structure that depart from the structure I previously specified in this thread.
> > In particular, I might have some projects in which only one actor is present.
> > Showing by example is perhaps easiest. Here is the result from Nick's code (based on my previously supplied data structure) on a slightly expanded dataset:
> > year  project_id  actor_id  condition  proj    act  mywanted
> > 2000  1           1         1          1 2000  1    2
> > 2000  2           1         1          2 2000  1    2
> > 2000  1           2         0          1 2000  2    1
> > 2000  2           2         0          2 2000  2    1
> > 2000  1           3         0          1 2000  3    1
> > 2000  2           3         0          2 2000  3    1
> > 2000  3           4         1          3 2000  4    4
> > 2000  3           5         0          3 2000  5    2
> > 2000  3           6         0          3 2000  6    2
> > 2000  4           7         0          4 2000  7    2
> > 2001  5           1         1          5 2001  1    2
> > 2001  5           2         0          5 2001  2    1
> > 2001  6           2         0          6 2001  2    1
> > 2001  5           3         0          5 2001  3    1
> > 2001  6           3         0          6 2001  3    1
> > 2001  5           4         1          5 2001  4    4
> > 2001  6           4         1          6 2001  4    4
> > 2001  7           5         0          7 2001  5    2
> > 2001  7           6         0          7 2001  6    2
> > 2001  7           7         0          7 2001  7    2
> > 2001  8           8         0          8 2001  8    0
> >
> > In this result (focusing on year==2000 only now), the mywanted score for actor_id==7 in project_id==4 is incorrect (correct mywanted==0). The mywanted scores for actor_ids in project_id==3 are also incorrect.
> >
> > In year==2001, the mywanted score==0 for actor_id==8 in project_id==8 is on the other hand correct.
> > How can I get around this? I am sorry that I did not include these structural details in my initial post.
> > Sincerely,
> > Erik.
> >
> > ----------------------------------------
> >> Date: Tue, 16 Oct 2012 11:49:22 +0100
> >> Subject: Re: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)
> >> From: njcoxstata@gmail.com
> >> To: statalist@hsphsun2.harvard.edu
> >>
> >> I suspect that you didn't copy all the code. The last line of code
> >> has a brace (curly bracket }) by itself.
> >>
> >> On Tue, Oct 16, 2012 at 11:44 AM, Erik Aadland <erikaadland@hotmail.com> wrote:
> >> > Thank you Nick!
> >> > You are quite right. I was imprecise; it is distinct actors I want to capture.
> >> > When I run your suggested code, I get this error message after the following line of code:
> >> >
> >> > qui forval a = 1/`nact' {
> >> > unexpected end of file
> >> > r(612);
> >> >
> >> > What could possibly cause this error message? I am using Stata 10.
> >> > Thanks again and kind regards,
> >> > Erik.
> >> >
> >> >
> >> > ----------------------------------------
> >> >> Date: Tue, 16 Oct 2012 11:19:41 +0100
> >> >> Subject: Re: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)
> >> >> From: njcoxstata@gmail.com
> >> >> To: statalist@hsphsun2.harvard.edu
> >> >>
> >> >> First off, on my list of hobby-horses is a prejudice that the word
> >> >> "unique" is misused here, although you are in very good company:
> >> >> StataCorp itself does it in various places, e.g. -codebook-, although
> >> >> I am working on changing their habits if I can. The word "unique"
> >> >> strictly means occurring once only; I recommend the word "distinct"
> >> >> for what you want. There is a longer discussion of terminology in
> >> >>
> >> >> SJ-8-4 dm0042 . . . . . . . . . . . . Speaking Stata: Distinct observations
> >> >> (help distinct if installed) . . . . . . N. J. Cox and G. M. Longton
> >> >> Q4/08 SJ 8(4):557--568
> >> >> shows how to answer questions about distinct observations
> >> >> from first principles; provides a convenience command
> >> >>
> >> >> That said, when faced with a problem like yours, vague ideas of
> >> >> possible solutions rise up. Is this a case for associative arrays as
> >> >> implemented in Mata? Is there a cunning restructuring of the data from
> >> >> which the answer would fall out easily? Precise inspiration was
> >> >> lacking and what seemed crucial was that you need to consider each
> >> >> actor in each combination of project and year. That pointed to
> >> >> loops over actors _and_ over project-years. Once that idea was taken
> >> >> up, life is usually easier if all identifiers run over the integers
> >> >> from 1 up. Also, the flavour of compiling a list and eventually
> >> >> counting distinct members of other actors suggested -levelsof- and the
> >> >> list manipulation tools documented at -help macrolists-.
> >> >>
> >> >> So, here is my code. Absolutely nothing rules out other kinds of solutions.
> >> >>
> >> >> input year project_id actor_id condition wanted
> >> >> 2000 1 1 1 2
> >> >> 2000 1 2 0 1
> >> >> 2000 1 3 0 1
> >> >> 2000 1 7 1 2
> >> >> 2000 2 1 1 2
> >> >> 2000 2 2 0 1
> >> >> 2000 2 3 0 1
> >> >> 2000 3 4 1 2
> >> >> 2000 3 5 0 1
> >> >> 2000 3 6 0 1
> >> >> 2000 3 . . .
> >> >> 2001 4 1 1 2
> >> >> 2001 4 2 0 1
> >> >> 2001 4 3 0 1
> >> >> end
> >> >>
> >> >> * identifiers guaranteed to run 1 up if the real ones don't!
> >> >> * note that "same project, same year" defines a group
> >> >> egen proj = group(project_id year), label
> >> >> su proj, meanonly
> >> >> local nproj = r(max)
> >> >>
> >> >> egen act = group(actor_id), label
> >> >> su act, meanonly
> >> >> local nact = r(max)
> >> >>
> >> >> gen mywanted = .
> >> >>
> >> >> * lists of those in each project and year and condition == 0
> >> >> qui forval p = 1/`nproj' {
> >> >>     levelsof act if proj == `p' & condition == 0, local(who`p')
> >> >> }
> >> >>
> >> >> macro list
> >> >>
> >> >> * now cycle over actors
> >> >> qui forval a = 1/`nact' {
> >> >>
> >> >>     * blank out workspace
> >> >>     local work
> >> >>
> >> >>     * if actor was included, we want to add that list to workspace
> >> >>     * in practice -if r(N)- will be true if and only if -r(N)- is positive
> >> >>     forval p = 1/`nproj' {
> >> >>         count if act == `a' & proj == `p'
> >> >>         if r(N) local work `work' `who`p''
> >> >>     }
> >> >>
> >> >>     * remove duplicates
> >> >>     local work : list uniq work
> >> >>     * remove this actor
> >> >>     local work : list work - a
> >> >>     * see what we got for debugging
> >> >>     noi di "`a' `work'"
> >> >>
> >> >>     replace mywanted = `: list sizeof work' if act == `a'
> >> >> }
> >> >>
> >> >> Nick
> >> >>
> >> >>
> >> >> On Tue, Oct 16, 2012 at 9:52 AM, Erik Aadland <erikaadland@hotmail.com> wrote:
> >> >>
> >> >> > I am trying to generate a variable "wanted" that by each focal actor and year captures the total number of unique actors (excluding the focal actor) that meet a specified condition (condition == 0) and that the focal actor has occurred together with in one or more projects.
> >> >> > This is my data structure:
> >> >> > year project_id actor_id condition wanted
> >> >> > 2000 1 1 1 2
> >> >> > 2000 1 2 0 1
> >> >> > 2000 1 3 0 1
> >> >> > 2000 1 7 1 2
> >> >> > 2000 2 1 1 2
> >> >> > 2000 2 2 0 1
> >> >> > 2000 2 3 0 1
> >> >> > 2000 3 4 1 2
> >> >> > 2000 3 5 0 1
> >> >> > 2000 3 6 0 1
> >> >> > 2000 3 . . .
> >> >> > 2001 4 1 1 2
> >> >> > 2001 4 2 0 1
> >> >> > 2001 4 3 0 1
> >> >> > .....and so on
> >> >> > So in year == 2000, actor_id == 1 has occurred with 2 unique actor_id (namely 2 and 3) meeting condition == 0 in projects. Therefore, wanted == 2 for actor_id == 1 in year == 2000.
> >> >> > My attempted code (which is quite wrong):
> >> >> > sort actor_id year projects ;
> >> >> > by actor_id year: gen nvals = _n == 1 ;
> >> >> > sort actor_id year project_id ;
> >> >> > egen wanted = total(nvals & condition == 0), by(agency_id year) ;
> >> >> > replace wanted = wanted - (nvals & condition == 0) ;
> >> >> >

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
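Nick's remark that his code totals over all (project, year) possibilities suggests one possible modification: loop over each (actor, year) pair rather than over each actor. The following is a minimal sketch of that idea, not code posted in this thread. It uses the first four columns of the expanded dataset from Erik's example; the names actyear, nay and projs are introduced here purely for illustration.

```stata
* Sketch only: one way to count per actor AND year, assuming the
* definition in Erik's posts. Not code posted in the thread.
clear
input year project_id actor_id condition
2000 1 1 1
2000 2 1 1
2000 1 2 0
2000 2 2 0
2000 1 3 0
2000 2 3 0
2000 3 4 1
2000 3 5 0
2000 3 6 0
2000 4 7 0
2001 5 1 1
2001 5 2 0
2001 6 2 0
2001 5 3 0
2001 6 3 0
2001 5 4 1
2001 6 4 1
2001 7 5 0
2001 7 6 0
2001 7 7 0
2001 8 8 0
end

* identifiers running 1 up, as in Nick's code
egen proj = group(project_id year)
su proj, meanonly
local nproj = r(max)
egen act = group(actor_id)

* hypothetical extra identifier: one group per (actor, year) pair
egen actyear = group(act year)
su actyear, meanonly
local nay = r(max)

* lists of actors with condition == 0 in each project-year
forval p = 1/`nproj' {
    qui levelsof act if proj == `p' & condition == 0, local(who`p')
}

gen mywanted = .

* cycle over (actor, year) pairs instead of over actors
forval g = 1/`nay' {
    * act is constant within the group, so r(min) is the actor
    su act if actyear == `g', meanonly
    local a = r(min)

    * project-years this actor took part in during this year
    qui levelsof proj if actyear == `g', local(projs)

    * pool the condition == 0 lists of those project-years
    local work
    foreach p of local projs {
        local work `work' `who`p''
    }

    * remove duplicates, then remove this actor
    local work : list uniq work
    local work : list work - a

    qui replace mywanted = `: list sizeof work' if actyear == `g'
}

list year project_id actor_id condition mywanted, sepby(year)
```

On these data the sketch should give mywanted == 2 for actor_id 4, 1 for actor_id 5 and 6, and 0 for actor_id 7 in year 2000, which are the values Erik says he expects. Because each (actor, year) pair only pools the lists of its own project-years, projects from other years no longer contribute to the count.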

**Follow-Ups**:
- **Re: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)** *From:* Nick Cox <njcoxstata@gmail.com>

**References**:
- **st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)** *From:* Erik Aadland <erikaadland@hotmail.com>
- **Re: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)** *From:* Nick Cox <njcoxstata@gmail.com>
- **RE: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)** *From:* Erik Aadland <erikaadland@hotmail.com>
- **Re: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)** *From:* Nick Cox <njcoxstata@gmail.com>
- **RE: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)** *From:* Erik Aadland <erikaadland@hotmail.com>
- **Re: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)** *From:* Nick Cox <njcoxstata@gmail.com>
