Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org** is already up and running.


From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: creating variable summarizing for each individual properties of other members of a group at t-1
Date: Wed, 18 May 2011 01:00:02 +0100

gen all_score = sum(score)

should be

gen all_score = sum(ind_score)

On Tue, May 17, 2011 at 11:32 PM, Nick Cox <njcoxstata@gmail.com> wrote:
> Thanks. I (think I) am understanding more. Now sounds like
>
> 0. Tag individuals' first entries
>
> bysort ind_id (yearmonth) : gen ind_score = sum(ind_entry)
> by ind_id : replace ind_score = ind_score == 1 & ind_score[_n-1] != 1
>
> // != 1 not == 0 to catch ind_entry == 1 when _n == 1 for which
> // ind_entry[_n-1] is missing
>
> See also
>
> FAQ: First and last occurrences in panel data (N. J. Cox, 3/07)
> How can I identify first and last occurrences systematically in panel data?
> http://www.stata.com/support/faqs/data/firstoccur.html
>
> 1. Get the sum of all individual first entries
>
> sort yearmonth
> gen all_score = sum(score)
> by yearmonth : replace all_score = all_score[_N]
>
> 2. Subtract this individual
>
> replace all_score = all_score - ind_score
>
> 3. Look one step back in time
>
> bysort ind_id (yearmonth) : gen prevscore = all_score[_n-1]
>
> Nick
>
> On Tue, May 17, 2011 at 7:38 PM, Erik Aadland <erikaadland@hotmail.com> wrote:
>> My apologies if I am being imprecise.
>>
>> I am struggling in my attempt to create the following two variables.
>>
>> One variable that yields a score for each individual in the dataset for each yearmonth: the score is the sum of all ind_entry = 1 for all the other individuals, excluding the focal individual, up until and including time: yearmonth - 1.
>>
>> And a second variable that identifies for each individual in the dataset for each yearmonth: the total number of other individuals in the dataset that have experienced ind_entry = 1 at least once up until and including time: yearmonth - 1. In this variable, I want to count the number of individuals that have experienced ind_entry = 1, not how many times they have entered.
>
> Nick
>
>>> I naturally have no idea what -ind_entry- means.
>>>
>>> I am just summing it, as the code implies. Is that not what you want?
>
> Erik Aadland
>
>>> Thank you so much for your input, Nick.
>>>
>>> I have experimented and generated different variables previously relying on the very helpful FAQ in question.
>>>
>>> I am struggling with this problem however. When I apply the suggested code below, it appears that the calculation of the number of peers adds up more than it should for ind_ids with more ind_entry = 1 relative to other ind_ids, and consequently they contribute more to the "score" than those with fewer ind_entry = 1.
>>>
>>> Referring to the example dataset, ind_id 2 is given the correct "prevscore". Ind_id 4, however, is not. By yearmonth 12, ind_id 4 has contributed 2 ind_entry=1 to the "score", which is correct for ind_id 2. However, ind_id 2 has not yet experienced ind_entry=1. Consequently, score - 1 for ind_id 4 yields a score = 1 in yearmonth 11 and 12, when the correct score = 0. And so on.
>>>
>>> Here is the suggested code as I applied it:
>>>
>>> clear ;
>>> #delimit ;
>>> use "ind_entry_ex.dta" ;
>>> sort yearmonth ;
>>> gen score = sum(ind_entry) ;
>>> by yearmonth: replace score = score[_N] ;
>>> replace score = score - ind_entry ;
>>> bysort ind_id (yearmonth): gen prevscore = score[_n-1] ;
>>>
>>> Here is the output:
>>>
>>> year  month  yearmonth  ind_id  ind_entry  score  prevscore
>>> 2003     10         10       2          0      1
>>> 2003     11         11       2          0      1          1
>>> 2003     12         12       2          0      2          1
>>> 2004      1         13       2          0      2          2
>>> 2004      2         14       2          1      2          2
>>> 2004      3         15       2          0      3          2
>>> 2003     10         10       4          1      0
>>> 2003     11         11       4          0      1          0
>>> 2003     12         12       4          1      1          1
>>> 2004      1         13       4          0      2          1
>>> 2004      2         14       4          0      3          2
>>> 2004      3         15       4          0      3          3
>>>
>>> I use Stata 10.
>
> Nick
>
>>> > I don't know what "I am familiar with" means here. Does it mean that you've read the FAQ but can't see how to apply it?
>>> >
>>> > This sounds to me like
>>> >
>>> > 1. Get the sum of all individual entries
>>> >
>>> > sort yearmonth
>>> > gen score = sum(ind_entry)
>>> > by yearmonth : replace score = score[_N]
>>> >
>>> > 2. Subtract this individual
>>> >
>>> > replace score = score - ind_entry
>>> >
>>> > 3. Look one step back in time
>>> >
>>> > bysort ind_id (yearmonth) : gen prevscore = score[_n-1]
>
> Erik Aadland
>
>>> > I need to create a variable that sums for each individual in my dataset the total number of ind_entry of all other individuals at time: yearmonth - 1.
>>> > I have attached a small example of my data structure below. So for instance, given the small dataset below, for ind_id 2 in yearmonth 11 this variable score = 1. But for ind_id 4 in the same yearmonth, the score = 0.
>>> >
>>> > I would also like to generate a variable that identifies for each individual the unique number of other individuals in the dataset that have experienced ind_entry = 1 at least once up until time: yearmonth - 1.
>>> >
>>> > I am familiar with the following FAQ: http://www.stata.com/support/faqs/data/members.html
>>> >
>>> > My data structure is snapshot data in principle like the example below, but some individuals enter the observation window later than others (i.e. in later yearmonths):
>>> >
>>> > year  month  yearmonth  ind_id  ind_entry
>>> > 2003     10         10       2          0
>>> > 2003     11         11       2          0
>>> > 2003     12         12       2          0
>>> > 2004      1         13       2          0
>>> > 2004      2         14       2          1
>>> > 2004      3         15       2          0
>>> > 2003     10         10       4          1
>>> > 2003     11         11       4          0
>>> > 2003     12         12       4          1
>>> > 2004      1         13       4          0
>>> > 2004      2         14       4          0
>>> > 2004      3         15       4          0
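For readers following the thread, here is a minimal sketch of one way to compute both of the variables Erik describes, run on the 12-row example dataset from his first post. It is an illustration under stated assumptions, not the code posted in the thread. It follows the same three steps (running total over everyone, subtract the focal individual, look one observation back), but it subtracts the individual's own running total rather than only the current month's value, which is one reading of what the description "all the other individuals ... up until and including yearmonth - 1" requires. The first-entry tag is built with generate from the running sum, because replace works through observations in order and a lagged reference inside replace would see already-updated values. All variable names other than year, month, yearmonth, ind_id and ind_entry are invented for the example. It assumes ind_entry is coded 0/1 and that each individual has one observation per yearmonth with no gaps, so that [_n-1] really is yearmonth - 1.

```stata
clear
input year month yearmonth ind_id ind_entry
2003 10 10 2 0
2003 11 11 2 0
2003 12 12 2 0
2004  1 13 2 0
2004  2 14 2 1
2004  3 15 2 0
2003 10 10 4 1
2003 11 11 4 0
2003 12 12 4 1
2004  1 13 4 0
2004  2 14 4 0
2004  3 15 4 0
end

* Own running totals (hypothetical names): number of entries so far,
* whether the individual has entered at least once, and a tag that is
* 1 only on the row of the individual's first entry
bysort ind_id (yearmonth) : gen own_entries = sum(ind_entry)
gen byte own_ever = own_entries > 0
gen byte first_entry = ind_entry == 1 & own_entries == 1

* Running totals over everyone, set to the value at the end of each yearmonth
sort yearmonth ind_id
gen all_entries = sum(ind_entry)
gen all_ever = sum(first_entry)
by yearmonth : replace all_entries = all_entries[_N]
by yearmonth : replace all_ever = all_ever[_N]

* Remove the focal individual's own cumulative contribution
gen others_entries = all_entries - own_entries
gen others_ever = all_ever - own_ever

* Look one observation back within each individual
* (missing on each individual's first observation)
bysort ind_id (yearmonth) : gen prev_entries = others_entries[_n-1]
by ind_id : gen prev_ever = others_ever[_n-1]

list yearmonth ind_id ind_entry prev_entries prev_ever, sepby(ind_id) noobs
```

On the example data this gives prev_entries = 1 for ind_id 2 and 0 for ind_id 4 in yearmonth 11, matching the values Erik states as correct. If individuals have gaps or join the window late, the last step would need a lookup by yearmonth rather than [_n-1].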

