



Re: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)
Date   Tue, 16 Oct 2012 14:48:36 +0100

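It seems that you want a separate count for each year.

If that's so, the code needs an outer loop over years, and both the
-count- and the final -replace- must be restricted to that year. It
looks more like this: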

clear

input year    project_id    actor_id    condition
2000    1             1           1
2000    2             1           1
2000    1             2           0
2000    2             2           0
2000    1             3           0
2000    2             3           0
2000    3             4           1
2000    3             5           0
2000    3             6           0
2000    4             7           0
2001    5             1           1
2001    5             2           0
2001    6             2           0
2001    5             3           0
2001    6             3           0
2001    5             4           1
2001    6             4           1
2001    7             5           0
2001    7             6           0
2001    7             7           0
2001    8             8           0

end

egen proj = group(project_id year), label
su proj, meanonly
local nproj = r(max)

egen act = group(actor_id), label
su act, meanonly
local nact = r(max)

egen yr = group(year), label
su yr, meanonly
local nyr = r(max)

gen mywanted = .

* lists of those in each project and year and condition == 0
qui forval p = 1/`nproj' {
	levelsof act if proj == `p' & condition == 0, local(who`p')
}

macro list

* cycle over years

qui forval y = 1/`nyr' {

* now cycle over actors
	qui forval a = 1/`nact' {

* blank out workspace
		local work

* if actor was included, we want to add that list to workspace
		forval p = 1/`nproj' {
			count if act == `a' & proj == `p' & yr == `y'
			if r(N) local work `work' `who`p''
		}

* remove duplicates
		local work : list uniq work
* remove this actor
		local work : list work - a
* see what we got for debugging
		noi di "`a'       `work'"

		replace mywanted = `: list sizeof work' if act == `a' & yr == `y'
	}
}
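
For comparison, here is a rough sketch of a different route to the same
per-year count, restructuring the data with -joinby- rather than looping.
It is not the code from this thread: the names alt_wanted, other_id and
other_condition and the tempfiles are made up for illustration, and
-merge m:1- assumes Stata 11 or later (Stata 10, mentioned in the quoted
posts, would need the older -merge- syntax). The -assert- lines check only
the year 2000 values spelled out in the quoted message below.

```stata
clear
input year project_id actor_id condition
2000 1 1 1
2000 2 1 1
2000 1 2 0
2000 2 2 0
2000 1 3 0
2000 2 3 0
2000 3 4 1
2000 3 5 0
2000 3 6 0
2000 4 7 0
2001 5 1 1
2001 5 2 0
2001 6 2 0
2001 5 3 0
2001 6 3 0
2001 5 4 1
2001 6 4 1
2001 7 5 0
2001 7 6 0
2001 7 7 0
2001 8 8 0
end

* hypothetical tempfile names, used only in this sketch
tempfile original others counts
save `original'

* second copy of the data, with the actor recast as the "other" actor
rename actor_id other_id
rename condition other_condition
save `others'

* all pairs of actors sharing a project in the same year
use `original', clear
joinby year project_id using `others'

* keep partners with condition == 0, excluding the actor itself
keep if other_condition == 0 & other_id != actor_id

* one row per distinct (year, actor, partner), then count partners
keep year actor_id other_id
duplicates drop
contract year actor_id, freq(alt_wanted)
save `counts'

* put the counts back on the original rows; no partners means zero
use `original', clear
merge m:1 year actor_id using `counts'
replace alt_wanted = 0 if _merge == 1
drop _merge

* checks against the year 2000 values stated in the quoted post
assert alt_wanted == 2 if actor_id == 4 & year == 2000
assert alt_wanted == 1 if inlist(actor_id, 5, 6) & year == 2000
assert alt_wanted == 0 if actor_id == 7 & year == 2000
```

The pairwise join can get large when projects have many actors, so the
loop above may still be the safer choice for big datasets.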




On Tue, Oct 16, 2012 at 1:00 PM, Erik Aadland <erikaadland@hotmail.com> wrote:
> This is correct.
> So, referring to the results in my previous post: in year==2000, actor_id == 4|5|6 occur only in project_id==3, and for actor_id == 5 and 6 condition==0. Actor_id==4 should have a mywanted score == 2, while actor_id==5 and 6 should each have a mywanted score == 1. Actor_id == 7 occurs only in project_id==4 this year and shares no project with any other actor in this year (and therefore shares no project_id with any actor_id with condition==0), so it should have a mywanted score == 0.
> It puzzles me why the suggested code generates correct mywanted scores for the actor_ids in project_id==1 and 2, but not in project_id== 3 and 4.
> Kind regards,
> Erik.
>
>
> ----------------------------------------
>> Date: Tue, 16 Oct 2012 12:44:11 +0100
>> Subject: Re: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)
>> From: njcoxstata@gmail.com
>> To: statalist@hsphsun2.harvard.edu
>>
>> What you asked for, as I understood it, was the total number of distinct actors
>>
>> 1. that meet a specified condition (condition == 0)
>>
>> and
>>
>> 2. with which any actor has shared one or more projects in the same year.
>>
>> (I am ignoring the word "focal", which you haven't defined and I don't
>> understand.)
>>
>> If you want something else, please give the definition. It's not
>> enough (for me) to say that the code gives the wrong answer in some
>> cases. Note that my code gives the same answer for each actor, as it
>> is a total over all (project, year) possibilities. If you want a
>> different count for each (project, year) you'll need to modify the
>> code accordingly.
>>
>> Nick
>>
>> On Tue, Oct 16, 2012 at 12:25 PM, Erik Aadland <erikaadland@hotmail.com> wrote:
>> > The code works. Thank you Nick.
>> > However, I am experiencing a few problems that I suspect stem from details of my data structure that depart from the structure I previously specified in this thread.
>> > In particular, I might have some projects in which only one actor is present.
>> > Showing by example is perhaps easiest. Here is the result from Nick's code (based on my previously supplied data structure) on a slightly expanded dataset:
>> > year  project_id  actor_id  condition  proj    act  mywanted
>> > 2000  1           1         1          1 2000  1    2
>> > 2000  2           1         1          2 2000  1    2
>> > 2000  1           2         0          1 2000  2    1
>> > 2000  2           2         0          2 2000  2    1
>> > 2000  1           3         0          1 2000  3    1
>> > 2000  2           3         0          2 2000  3    1
>> > 2000  3           4         1          3 2000  4    4
>> > 2000  3           5         0          3 2000  5    2
>> > 2000  3           6         0          3 2000  6    2
>> > 2000  4           7         0          4 2000  7    2
>> > 2001  5           1         1          5 2001  1    2
>> > 2001  5           2         0          5 2001  2    1
>> > 2001  6           2         0          6 2001  2    1
>> > 2001  5           3         0          5 2001  3    1
>> > 2001  6           3         0          6 2001  3    1
>> > 2001  5           4         1          5 2001  4    4
>> > 2001  6           4         1          6 2001  4    4
>> > 2001  7           5         0          7 2001  5    2
>> > 2001  7           6         0          7 2001  6    2
>> > 2001  7           7         0          7 2001  7    2
>> > 2001  8           8         0          8 2001  8    0
>> >
>> > In this result (focusing on year==2000 only now), the mywanted score for actor_id==7 in project_id==4 is incorrect (correct mywanted==0). The mywanted scores for actor_ids in project_id==3 are also incorrect.
>> >
>> > In year==2001, on the other hand, the mywanted score==0 for actor_id==8 in project_id==8 is correct.
>> > How can I get around this? I am sorry that I did not include these structural details in my initial post.
>> > Sincerely,
>> > Erik.
>> >
>> > ----------------------------------------
>> >> Date: Tue, 16 Oct 2012 11:49:22 +0100
>> >> Subject: Re: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)
>> >> From: njcoxstata@gmail.com
>> >> To: statalist@hsphsun2.harvard.edu
>> >>
>> >> I suspect that you didn't copy all the code. The last line of code
>> >> has a brace (curly bracket }) by itself.
>> >>
>> >> On Tue, Oct 16, 2012 at 11:44 AM, Erik Aadland <erikaadland@hotmail.com> wrote:
>> >> > Thank you Nick!
>> >> > You are quite right. I was imprecise; it is distinct actors I want to capture.
>> >> > When I run your suggested code, I get this error message after the following line of code:
>> >> >
>> >> > qui forval a = 1/`nact' {
>> >> > unexpected end of file
>> >> > r(612);
>> >> >
>> >> > What could possibly cause this error message? I am using Stata 10.
>> >> > Thanks again and kind regards,
>> >> > Erik.
>> >> >
>> >> >
>> >> > ----------------------------------------
>> >> >> Date: Tue, 16 Oct 2012 11:19:41 +0100
>> >> >> Subject: Re: st: Capturing unique actors meeting conditions by focal actor and year (excluding focal actor)
>> >> >> From: njcoxstata@gmail.com
>> >> >> To: statalist@hsphsun2.harvard.edu
>> >> >>
>> >> >> First off, on my list of hobby-horses is a prejudice that the word
>> >> >> "unique" is misused here, although you are in very good company:
>> >> >> StataCorp itself does it in various places, e.g. -codebook-, although
>> >> >> I am working on changing their habits if I can. The word "unique"
>> >> >> strictly means occurring once only; I recommend the word "distinct"
>> >> >> for what you want. There is a longer discussion of terminology in
>> >> >>
>> >> >> SJ-8-4 dm0042 . . . . . . . . . . . . Speaking Stata: Distinct observations
>> >> >> (help distinct if installed) . . . . . . N. J. Cox and G. M. Longton
>> >> >> Q4/08 SJ 8(4):557--568
>> >> >> shows how to answer questions about distinct observations
>> >> >> from first principles; provides a convenience command
>> >> >>
>> >> >> That said, when faced with a problem like yours, vague ideas of
>> >> >> possible solutions rise up. Is this a case for associative arrays as
>> >> >> implemented in Mata? Is there a cunning restructuring of the data from
>> >> >> which the answer would fall out easily? Precise inspiration was
>> >> >> lacking and what seemed crucial was that you need to consider each
>> >> >> actor in each combination of project and year. That pointed to
>> >> >> loops over actors _and_ over project-years. Once that idea was taken
>> >> >> up, life is usually easier if all identifiers run over the integers
>> >> >> from 1 up. Also, the flavour of compiling a list and eventually
>> >> >> counting distinct members of other actors suggested -levelsof- and the
>> >> >> list manipulation tools documented at -help macrolists-.
>> >> >>
>> >> >> So, here is my code. Absolutely nothing rules out other kinds of solutions.
>> >> >>
>> >> >> input year project_id actor_id condition wanted
>> >> >> 2000 1 1 1 2
>> >> >> 2000 1 2 0 1
>> >> >> 2000 1 3 0 1
>> >> >> 2000 1 7 1 2
>> >> >> 2000 2 1 1 2
>> >> >> 2000 2 2 0 1
>> >> >> 2000 2 3 0 1
>> >> >> 2000 3 4 1 2
>> >> >> 2000 3 5 0 1
>> >> >> 2000 3 6 0 1
>> >> >> 2000 3 . . .
>> >> >> 2001 4 1 1 2
>> >> >> 2001 4 2 0 1
>> >> >> 2001 4 3 0 1
>> >> >> end
>> >> >>
>> >> >> * identifiers guaranteed to run 1 up if the real ones don't!
>> >> >> * note that "same project, same year" defines a group
>> >> >> egen proj = group(project_id year), label
>> >> >> su proj, meanonly
>> >> >> local nproj = r(max)
>> >> >>
>> >> >> egen act = group(actor_id), label
>> >> >> su act, meanonly
>> >> >> local nact = r(max)
>> >> >>
>> >> >> gen mywanted = .
>> >> >>
>> >> >> * lists of those in each project and year and condition == 0
>> >> >> qui forval p = 1/`nproj' {
>> >> >>     levelsof act if proj == `p' & condition == 0, local(who`p')
>> >> >> }
>> >> >>
>> >> >> macro list
>> >> >>
>> >> >> * now cycle over actors
>> >> >> qui forval a = 1/`nact' {
>> >> >>
>> >> >>     * blank out workspace
>> >> >>     local work
>> >> >>
>> >> >>     * if actor was included, we want to add that list to workspace
>> >> >>     * in practice -if r(N)- will be true if and only if -r(N)- is positive
>> >> >>     forval p = 1/`nproj' {
>> >> >>         count if act == `a' & proj == `p'
>> >> >>         if r(N) local work `work' `who`p''
>> >> >>     }
>> >> >>
>> >> >>     * remove duplicates
>> >> >>     local work : list uniq work
>> >> >>     * remove this actor
>> >> >>     local work : list work - a
>> >> >>     * see what we got for debugging
>> >> >>     noi di "`a' `work'"
>> >> >>
>> >> >>     replace mywanted = `: list sizeof work' if act == `a'
>> >> >> }
>> >> >>
>> >> >> Nick
>> >> >>
>> >> >>
>> >> >> On Tue, Oct 16, 2012 at 9:52 AM, Erik Aadland <erikaadland@hotmail.com> wrote:
>> >> >>
>> >> >> > I am trying to generate a variable "wanted" that by each focal actor and year captures the total number of unique actors (excluding the focal actor) that meet a specified condition (condition == 0) and that the focal actor has occurred together with in one or more projects.
>> >> >> > This is my data structure:
>> >> >> > year project_id actor_id condition wanted
>> >> >> > 2000 1 1 1 2
>> >> >> > 2000 1 2 0 1
>> >> >> > 2000 1 3 0 1
>> >> >> > 2000 1 7 1 2
>> >> >> > 2000 2 1 1 2
>> >> >> > 2000 2 2 0 1
>> >> >> > 2000 2 3 0 1
>> >> >> > 2000 3 4 1 2
>> >> >> > 2000 3 5 0 1
>> >> >> > 2000 3 6 0 1
>> >> >> > 2000 3 . . .
>> >> >> > 2001 4 1 1 2
>> >> >> > 2001 4 2 0 1
>> >> >> > 2001 4 3 0 1
>> >> >> > .....and so on
>> >> >> > So in year == 2000, actor_id == 1 has occurred with 2 unique actor_id (namely 2 and 3) meeting condition == 0 in projects. Therefore, wanted == 2 for actor_id == 1 in year == 2000.
>> >> >> > My attempted code (which is quite wrong):
>> >> >> > sort actor_id year projects ;
>> >> >> > by actor_id year: gen nvals = _n == 1 ;
>> >> >> > sort actor_id year project_id ;
>> >> >> > egen wanted = total(nvals & condition == 0), by(agency_id year) ;
>> >> >> > replace wanted = wanted - (nvals & condition == 0) ;
>> >> >>
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/

