
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



AW: st: creating summary variables with overlapping peer groups


From   "Assistant, Research" <Research.Assistant@deval.org>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   AW: st: creating summary variables with overlapping peer groups
Date   Tue, 30 Jul 2013 08:44:54 +0000

Dear Nick,

Thank you for your response. I programmed it the way suggested in the link you sent me. However, it is just as slow as the version using the summarize command. I guess there is no quicker way, as Stata needs to run through every single observation. If you have another suggestion, I would be grateful for your reply. The code I wrote this morning looks like this:

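* observation index, used to address one row at a time inside the loop
gen idnum=_n
gen mean=.
qui forval i=1/`=_N' {
    * flag the peers of the current observation: own cluster (without itself) plus both neighbor clusters
    gen include=1 if (Cluster==Cluster[`i'] & _n!=`i') | Cluster==neighbor1[`i'] | Cluster==neighbor2[`i']
    * include is missing for non-peers, so they drop out of the mean
    egen average=mean(variable1*include)
    replace mean=average if idnum==`i'
    drop include average
}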

All the best,
Max



-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: Monday, 29 July 2013 16:48
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: creating summary variables with overlapping peer groups

Start with

http://www.stata.com/support/faqs/data-management/creating-variables-recording-properties/

Nick
njcoxstata@gmail.com

On 29 July 2013 14:55, Maximilian Linek

> I am looking for an efficient way to create a variable for each individual (observation) that contains her group mean without her own value. However, each individual belongs to several groups. The problem arises in a neighborhood or peer-group analysis.
>
> My data looks like the following.
>
> ID      Cluster         neighbor1       neighbor2       variable1
> A       1               2               3               1
> B       1               2               3               0
> C       2               5               1               1
> D       2               5               1               1
> E       3               1               4               1
> F       4               3               5               0
> G       5               2               4               0
>
>
> ID identifies each individual; Cluster is the neighborhood in which an individual lives; neighbor1 is the nearest adjacent neighborhood; neighbor2 is the second-nearest adjacent neighborhood; variable1 is the variable whose mean I want to generate for each individual.
>
> For example, individual A is in the same neighborhood as individual B and in a neighborhood adjacent to those of individuals C, D, and E. The variable I want to generate is the mean of this peer group without the individual's own observation. The value of variable1 is 0 for B and 1 for C, D, and E, so the mean I would like to generate for individual A is (0+1+1+1)/4 = 0.75. (By the same logic, the value for individual B would be 1, and so on.)
>
> One solution, which unfortunately is very inefficient, is given by:
>
> gen mean=.
> forval i=1/`=_N' {
>         summarize variable1 if (Cluster==Cluster[`i'] & _n!=`i') | Cluster==neighbor1[`i'] ///
>         | Cluster==neighbor2[`i'], meanonly
>         quietly replace mean=r(mean) in `i'
> }
>
> I am looking for an efficient way to do the above.
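
One possible way to avoid the loop over observations, offered as a sketch under assumptions rather than as the method discussed in this thread: the peer mean pools the own cluster (minus the own value) with the two neighbor clusters, so it can be assembled from per-cluster totals and counts that are computed once and then looked up three times with merge. The sketch assumes that Cluster, neighbor1 and neighbor2 are three distinct clusters for every observation (otherwise a cluster would be counted twice, unlike in the loop above). The names tot0, cnt0, tot1, cnt1, tot2, cnt2, peer_mean and the tempfiles are made up for illustration.

```stata
clear
input str1 ID Cluster neighbor1 neighbor2 variable1
A 1 2 3 1
B 1 2 3 0
C 2 5 1 1
D 2 5 1 1
E 3 1 4 1
F 4 3 5 0
G 5 2 4 0
end

* per-cluster total and nonmissing count of variable1, computed once
preserve
collapse (sum) tot0=variable1 (count) cnt0=variable1, by(Cluster)
tempfile own nb1 nb2
save `own'
* the same statistics, re-keyed so they can be looked up by neighbor1 and neighbor2
rename (Cluster tot0 cnt0) (neighbor1 tot1 cnt1)
save `nb1'
rename (neighbor1 tot1 cnt1) (neighbor2 tot2 cnt2)
save `nb2'
restore

merge m:1 Cluster   using `own', nogenerate
merge m:1 neighbor1 using `nb1', keep(master match) nogenerate
merge m:1 neighbor2 using `nb2', keep(master match) nogenerate

* a neighbor cluster with no residents in the data contributes nothing
foreach v of varlist tot1 cnt1 tot2 cnt2 {
    replace `v' = 0 if missing(`v')
}

* pooled mean over own cluster (own value removed) and both neighbor clusters
gen double peer_mean = (tot0 - cond(missing(variable1), 0, variable1) + tot1 + tot2) ///
    / (cnt0 - !missing(variable1) + cnt1 + cnt2)

* check against the worked example in the question
assert abs(peer_mean - 0.75) < 1e-8 if ID == "A"
assert abs(peer_mean - 1)    < 1e-8 if ID == "B"
```

The work here is one collapse plus three merges, so the cost no longer grows with the square of the number of observations as it does when every iteration scans the whole dataset.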
>
> Furthermore, to refine the analysis, I would like to weight the contributions of the own and the adjacent neighborhoods in the calculation. For example, the own-neighborhood mean (without the own observation) would enter the summary variable for each individual with a weight of 0.5, the neighbor1 mean with a weight of 0.3, and the neighbor2 mean with a weight of 0.2.
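
A hedged sketch of the weighted variant, again built from per-cluster totals and counts rather than a loop. It is not taken from the thread. The weights 0.5, 0.3 and 0.2 come from the paragraph above; everything else is an assumption, in particular the choice to rescale the weights when a component mean does not exist (for example, an individual who is alone in her cluster has no own-cluster mean without herself). The names m0, m1, m2, wsum and wmean are hypothetical.

```stata
clear
input str1 ID Cluster neighbor1 neighbor2 variable1
A 1 2 3 1
B 1 2 3 0
C 2 5 1 1
D 2 5 1 1
E 3 1 4 1
F 4 3 5 0
G 5 2 4 0
end

* per-cluster total and nonmissing count, saved under three different keys
preserve
collapse (sum) tot0=variable1 (count) cnt0=variable1, by(Cluster)
tempfile own nb1 nb2
save `own'
rename (Cluster tot0 cnt0) (neighbor1 tot1 cnt1)
save `nb1'
rename (neighbor1 tot1 cnt1) (neighbor2 tot2 cnt2)
save `nb2'
restore

merge m:1 Cluster   using `own', nogenerate
merge m:1 neighbor1 using `nb1', keep(master match) nogenerate
merge m:1 neighbor2 using `nb2', keep(master match) nogenerate

* component means: own cluster without the own observation, then each neighbor cluster
* (division by zero yields missing, which marks a component as unavailable)
gen double m0 = (tot0 - cond(missing(variable1), 0, variable1)) / (cnt0 - !missing(variable1))
gen double m1 = tot1 / cnt1
gen double m2 = tot2 / cnt2

* ASSUMPTION: weights are rescaled over the components that exist
gen double wsum  = 0.5*!missing(m0) + 0.3*!missing(m1) + 0.2*!missing(m2)
gen double wmean = (0.5*cond(missing(m0), 0, m0) + 0.3*cond(missing(m1), 0, m1) ///
    + 0.2*cond(missing(m2), 0, m2)) / wsum

* A: 0.5*0 + 0.3*1 + 0.2*1 = 0.5; E has no own-cluster peers, so (0.3*0.5 + 0.2*0)/0.5 = 0.3
assert abs(wmean - 0.5) < 1e-8 if ID == "A"
assert abs(wmean - 0.3) < 1e-8 if ID == "E"
```

If a missing component should instead make the whole weighted mean missing, the last generate statement can be replaced by the plain sum 0.5*m0 + 0.3*m1 + 0.2*m2.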
>
> A last extension I am interested in is how to restrict the observations entering the summary variable to those within an age span around an individual's own age. That is, an individual aged 25 should consider only individuals between 22 and 28 as her peer group, and only individuals in the own or adjacent neighborhoods who fall into this age span would enter the calculation of the mean.
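
With an age window that differs for every individual, fixed per-cluster totals no longer suffice. One possible approach, sketched here under assumptions and not taken from the thread, is to form all (individual, candidate peer) pairs with joinby, keep the pairs inside the age window, and collapse. The age variable and its values below are invented purely for illustration, as are the names target, role, peer_ID, peer_value, peer_age and peer_mean_age. The sketch also assumes that ID uniquely identifies observations.

```stata
clear
* NOTE: the age column is hypothetical; the question does not show one
input str1 ID Cluster neighbor1 neighbor2 variable1 age
A 1 2 3 1 25
B 1 2 3 0 27
C 2 5 1 1 22
D 2 5 1 1 30
E 3 1 4 1 26
F 4 3 5 0 40
G 5 2 4 0 24
end

* candidate peers: one row per person, keyed by the cluster she lives in
preserve
keep ID Cluster variable1 age
rename (ID Cluster variable1 age) (peer_ID target peer_value peer_age)
tempfile peers result
save `peers'
restore

preserve
* one row per (person, cluster whose residents count as her peers)
rename (Cluster neighbor1 neighbor2) (target0 target1 target2)
keep ID age target0 target1 target2
reshape long target, i(ID) j(role)
* all pairs of a person with the residents of her three target clusters
joinby target using `peers'
drop if peer_ID == ID
* guard against double counting if two target clusters coincide
duplicates drop ID peer_ID, force
* keep peers at most 3 years younger or older
keep if abs(peer_age - age) <= 3
collapse (mean) peer_mean_age = peer_value, by(ID)
save `result'
restore

* individuals without any peer in the window keep a missing value
merge 1:1 ID using `result', nogenerate

* A (25): B (27), C (22) and E (26) qualify, D (30) does not, so the mean is 2/3
assert abs(peer_mean_age - 2/3) < 1e-8 if ID == "A"
assert missing(peer_mean_age) if ID == "F"
```

The pairwise dataset created by joinby can be much larger than the original data, so for very large neighborhoods this trades memory for speed.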

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


