st: Re: obtaining statistics from clustered data


From   "Tony Brady" <tony@sealedenvelope.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Re: obtaining statistics from clustered data
Date   Wed, 26 Jun 2002 21:03:19 +0100

Interesting question... my solution is to use egen within a loop that omits
each observation in turn from the cluster. In this code I'm averaging sbp
within clusters defined by id. Obviously substitute in your own variable
names for id, sbp and msbp (the variable that ends up with your statistic of
interest).

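* number the observations within each cluster
sort id
by id: gen rank=_n
* the largest cluster size is the number of passes needed
quietly summarize rank
local hirank=r(max)
gen msbp=.
foreach i of numlist 1/`hirank' {
    * cluster mean of sbp, leaving out the observation being processed
    egen temp=mean(sbp) if rank!=`i', by(id)
    * temp is missing for the omitted observation, so copy the value
    * from another member of the cluster: the last one, or the first
    * one when the omitted observation is itself the last
    bysort id (rank): replace msbp=temp[_N] if rank==`i' & `i'<_N
    bysort id (rank): replace msbp=temp[1] if rank==`i' & `i'==_N
    drop temp
}
drop rank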

You can change the egen line to, for instance, egen temp=sd(sbp)... if you
want standard deviations instead of means or any other statistic offered by
egen. Using egen isn't an elegant solution but it does offer some
flexibility and avoids the need to write lots of code.
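
For the mean and standard deviation specifically, the same leave-one-out statistics can also be obtained without a loop, by subtracting each observation from the cluster totals. The following is only a sketch under some assumptions: the toy data and the variable names s1, s2, n, loo_mean, loo_var and loo_sd are made up for illustration, and egen's total() function is the name used in more recent Stata releases (older releases call it sum()). Unlike the loop, this version leaves the result missing for observations whose own sbp is missing.

```stata
* toy data for illustration only: two clusters of size 3 and 2
clear
input id sbp
1 120
1 130
1 140
2 110
2 150
end

* cluster totals, sums of squares and nonmissing counts
egen double s1 = total(sbp), by(id)
egen double s2 = total(sbp^2), by(id)
egen n = count(sbp), by(id)

* leave-one-out mean: needs at least one other observation
gen double loo_mean = (s1 - sbp)/(n - 1) if n > 1 & !missing(sbp)

* leave-one-out sample variance: needs at least two other observations
gen double loo_var = ((s2 - sbp^2) - (s1 - sbp)^2/(n - 1))/(n - 2) ///
    if n > 2 & !missing(sbp)

* guard against tiny negative values from rounding; max() ignores
* missing values, so restrict to nonmissing loo_var explicitly
gen double loo_sd = sqrt(max(loo_var, 0)) if !missing(loo_var)

list id sbp loo_mean loo_sd, sepby(id)
```

With these toy data the leave-one-out means should come out as 135, 130 and 125 in the first cluster and 150 and 110 in the second. The subtraction trick only works for statistics that can be built from sums, so the egen loop remains the more general route.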

Tony

___________________
Tony Brady
Sealed Envelope Ltd

> Date: Tue, 25 Jun 2002 22:30:54 -0500
> From: "Javier Escobal" <jescobal@grade.org.pe>
> Subject: st: programming question: obtaining statistics from clustered data
>
> Dear Stata listers
>
> I need to construct a number of summary statistics for clustered data
> with a particular twist: for each observation the summary statistics
> should be constructed without taking into account the reference
> observation.
>
> I have a data base that has the following form:
>
> id    cluster    X
> 1        1        0.5
> 2        1        0.7
> 3        1        0.4
> .        .         .
> .        .         .
> .        .         .
> 100      3       0.6
> 101      3       0.6
> 102      3       0.8
> 103      3       0.2
>
> That is, observations can be grouped in clusters (of different sizes). I
> am interested in constructing different statistics: for example, for each
> observation "i" I need to capture the average and standard deviation of
> all observations that belong to the same cluster as "i",
> excluding observation "i".
>
> Can somebody help me with a simple way of constructing these aggregates?
>
> I appreciate the response
>
> Javier
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

