Statalist The Stata Listserver



Re: st: Friends' characteristics


From   Chris Ruebeck <ruebeckc@lafayette.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Friends' characteristics
Date   Fri, 1 Sep 2006 09:20:43 -0400

Thanks!  I see the key is using -rename- and -merge- .

Chris


On Aug 31, 2006, at 10:06 AM, Stas Kolenikov wrote:

I would go with a merge, something like

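tempfile friend1 friend2 friend3 friend4

preserve
keep id gpa
// start from suffix 0 so that each pass renames an exact variable name
rename id friend0
rename gpa gpa_f0
forvalues k=1/4 {
local j = `k' - 1
rename friend`j' friend`k'
rename gpa_f`j' gpa_f`k'
sort friend`k'
save `friend`k''
}
restore

forvalues k=1/4 {
sort friend`k'
// nokeep drops lookup records that are nobody's k-th friend
merge friend`k' using `friend`k'', nokeep
// _merge must be dropped before the next merge can create it again
drop _merge
}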

egen peer_gpa = rowmean(gpa_f*)

Of course, I have not tested this, but it should give you the idea.
I don't know whether it will be much faster (it very well might be),
but I think it is also somewhat clearer.
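
For readers on Stata 11 or later, the same rename-and-merge idea can be sketched with the newer -merge m:1- syntax, which needs no sorting and no cleanup of _merge. This is only a sketch and has not been run against real data: it assumes id is nonmissing and unique, and the tempfile name lookup is illustrative.

```stata
forvalues k = 1/4 {
    * build a lookup of (friend ID, friend GPA) keyed on friend`k'
    preserve
    keep id gpa
    rename id friend`k'
    rename gpa gpa_f`k'
    tempfile lookup
    save `lookup'
    restore

    * keep(master match) keeps every respondent and discards lookup
    * rows that are nobody's k-th friend; friends who are missing or
    * not in the data simply get a missing gpa_f`k'
    merge m:1 friend`k' using `lookup', keep(master match) nogenerate
}

* rowmean() ignores missing values, so absent friends are skipped
egen peer_gpa = rowmean(gpa_f1-gpa_f4)
```

Because -rowmean()- averages only the nonmissing values, a respondent with two matched friends gets the mean of those two GPAs, and a respondent with no matched friends gets a missing peer_gpa.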

On 8/30/06, Chris Ruebeck <ruebeckc@lafayette.edu> wrote:

(Previously sent but didn't see it appear on Statalist.)

Suppose my data set has these 6 variables,

id : this respondent's ID,
gpa : this respondent's GPA, and
friend1-4 : the IDs (possibly missing) of this respondent's friends.

I would like to create four new variables that record the GPA of each
respondent's friends, and then take their average. I have many
observations and want to avoid slower methods. Here is my code for
the first friend.

gen gpaf1 = .
egen group = group(friend1)
summarize group, meanonly
forvalues num = 1/`r(max)' {
// friend1 is constant within a group, so its mean is the friend's ID
summarize friend1 if group==`num', meanonly
local idf = r(mean)
summarize gpa if id==`idf', meanonly
replace gpaf1 = r(mean) if group==`num'
}

I figure I can nest this in a -forvalues- loop over friends 1 to 4 and
then use -egen ... rowmean(gpaf1-gpaf4)- to get the mean over friends.
In the code above, -levelsof- could replace the -egen ... group(friend1)-,
but macro length limits would require splitting the friends' IDs into
two to four groups.

Is there a faster method, perhaps with Mata?

(An additional wrinkle: some friends may no longer be in the
database---so an observation's friend1, for example, may contain a
number that is not the id of any observation. I think the code above
is robust to that problem, but perhaps this is another potential
speed improvement.)

--
Stas Kolenikov
http://stas.kolenikov.name
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
*


