Statalist The Stata Listserver



Re: st: Friends' characteristics


From   "Stas Kolenikov" <[email protected]>
To   [email protected]
Subject   Re: st: Friends' characteristics
Date   Thu, 31 Aug 2006 09:06:12 -0500

I would go with a merge, something like

tempfile friend1 friend2 friend3 friend4

preserve
keep id gpa
rename id friend
forvalues k=1/4 {
  rename friend friend`k'
  rename gpa gpa_f`k'
  // note that the abbreviation friend will be matched to friend1 when k==2, etc.
  // (likewise gpa to gpa_f1); this relies on variable abbreviation being on
  sort friend`k'
  save `friend`k''
}
restore

forvalues k=1/4 {
  sort friend`k'
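  // nokeep drops ids that are nobody's k-th friend; a separate _merge
  // variable per pass avoids a clash with the one from the previous pass
  merge friend`k' using `friend`k'', nokeep _merge(_m`k')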
}

egen peer_gpa = rowmean(gpa_f*)

Of course, I have not tested this, but it should give you an idea.
I don't know whether it will be much faster (it very well might be),
but I think it is also somewhat clearer.
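
For anyone who wants to try the idea on a toy example first, here is a
minimal, self-contained sketch. It assumes Stata 11 or later, where
-merge- takes the m:1 form with keep() and nogenerate, so no sorting or
_merge bookkeeping is needed. The four-row data set and the tempfile
name lookup are made up for illustration. Friend 99 is deliberately not
in the data, and missing friend IDs simply stay unmatched.

```stata
clear
input id gpa friend1 friend2 friend3 friend4
1 3.0 2 3  . .
2 3.5 1 .  . .
3 2.5 1 2 99 .
4 4.0 3 .  . .
end

tempfile lookup   // hypothetical name for the id -> gpa lookup file

forvalues k = 1/4 {
    // build a lookup keyed on friend`k' from the respondents themselves
    preserve
    keep id gpa
    rename id friend`k'
    rename gpa gpa_f`k'
    save `lookup', replace
    restore

    // many respondents can name the same friend, hence m:1;
    // keep(master match) discards ids that are nobody's k-th friend
    merge m:1 friend`k' using `lookup', keep(master match) nogenerate
}

// rowmean() ignores missing values, so absent friends drop out
egen peer_gpa = rowmean(gpa_f*)

// respondent 3 has friends 1 and 2 on file: (3.0 + 3.5)/2
assert peer_gpa == 3.25 if id == 3
list id peer_gpa
```

A respondent with no friends on file at all would get a missing
peer_gpa, which is probably what one wants here.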

On 8/30/06, Chris Ruebeck <[email protected]> wrote:
(Previously sent but didn't see it appear on Statalist.)

Suppose my data set has these 6 variables,

        id : this respondent's ID,
        gpa : this respondent's GPA, and
        friend1-4 : the IDs (possibly missing) of this respondent's friends.

I would like to create four new variables that record the GPA of each
respondent's friends, and then take their average.  I have many
observations and want to avoid slower methods.  Here is my code for
the first friend.

gen gpaf1 = .
egen group = group(friend1)
summarize group, meanonly
forvalues num = 1/`r(max)' {
        summarize friend1 if group==`num', meanonly
        local idf = r(mean)
        summarize gpa if id==`idf', meanonly
        replace gpaf1 = r(mean) if group==`num'
}

I figure I can nest this in a forvalues loop from 1-4, and then use
-egen ... rowmean(gpaf1-gpaf4)- to get the mean over friends.  In the
code above, levelsof could replace the -egen ... group(friend1)-, but
macro length limits would require splitting the friends' ids into two
to four groups.

Is there a faster method, perhaps with Mata?

(An additional wrinkle: some friends may no longer be in the
database---so an observation's friend1, for example, may contain a
number that is not the id of any observation.  I think the code above
is robust to that problem, but perhaps this is another potential
speed improvement.)

--
Stas Kolenikov
http://stas.kolenikov.name