Re: st: programming question: obtaining statistics from clustered data

From	Ulrich Kohler <[email protected]>
To	[email protected]
Subject	Re: st: programming question: obtaining statistics from clustered data
Date	Wed, 26 Jun 2002 09:42:19 +0000

Javier Escobal wrote:
> I have a data base that has the following form:
>
> id    cluster    X
> 1        1        0.5
> 2        1        0.7
> 3        1        0.4
> ..        .         .
> ..        .         .
> ..        .         .
> 100      3       0.6
> 101      3       0.6
> 102      3       0.8
> 103      3       0.2
>
> that is observations can be grouped in clusters (of different size). I
> am interested in constructing different statistics: for example for each
> observation "i" I need to capture the average and standard deviation of
> all observations that belong to the same cluster where "i" belongs
> excluding observation "i".

For the mean:

. sort cluster
. by cluster: gen sumx = sum(X)
. by cluster: replace sumx = sumx[_N] - X
. by cluster: gen meanx = sumx/(_N-1)
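
As a quick check of the arithmetic, here is a small sketch on a made-up
three-observation cluster (the data are hypothetical and only serve to
verify the steps above; they assume X has no missing values):

```stata
* Hypothetical data: one cluster with three observations
clear
input id cluster X
1 1 0.5
2 1 0.7
3 1 0.4
end

sort cluster id
* running sum of X within cluster; the last value is the cluster total
by cluster: gen double sumx = sum(X)
* cluster total minus the observation's own X
by cluster: replace sumx = sumx[_N] - X
* mean of the other _N-1 observations in the cluster
by cluster: gen double meanx = sumx/(_N-1)
list id X meanx
```

The listed means should be roughly .55, .45 and .6, e.g. (0.7 + 0.4)/2 = 0.55
for the first observation. A cluster with a single observation gets a
missing meanx, because the divisor _N-1 is zero.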

The standard deviation seems to be more difficult. At the moment I can
only think of a solution that loops over the observations within each
cluster. There must be a better solution, and I am sure that I have
overlooked something obvious. But anyway, you may use the following as a
starting point:


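 gen temp = .
 gen std = .
 egen group = group(cluster)  /* this might not be necessary */
 sort group
 local K = group[_N]
 local last 0
 forvalues k = 1/`K' {
     local first = 1 + `last'
     quietly count if group == `k'
     local N = r(N)
     local last = `first' + (`N'-1)
     forvalues i = `first'/`last' {
         /* squared deviations of the other cluster members from the
            mean that excludes observation i */
         quietly replace temp = .
         quietly replace temp = (X - meanx[`i'])^2 if _n ~= `i' & group == `k'
         quietly replace temp = sum(temp)
         /* temp[_N] is the sum of squares; N-1 observations remain,
            so the divisor is N-2; sqrt() turns the variance into the SD */
         quietly replace std = sqrt(temp[_N]/(`N'-2)) if _n == `i'
     }
 }
 drop temp group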
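
One possible way to avoid the loop, offered only as a sketch: with S the
cluster sum of X, Q the cluster sum of X^2 and n the cluster size, the
sum of squared deviations of the other n-1 observations around their own
mean m_i = (S - x_i)/(n-1) is (Q - x_i^2) - (n-1)*m_i^2, and dividing by
n-2 gives the variance. The variable names sx, sxx, n1, mean_loo and
sd_loo below are my own, the data are hypothetical, and X is assumed to
have no missing values:

```stata
* Hypothetical data: one cluster with four observations
clear
input id cluster X
1 1 0.6
2 1 0.6
3 1 0.8
4 1 0.2
end

* cluster totals of X and X^2, repeated on every observation
bysort cluster (id): egen double sx  = total(X)
bysort cluster (id): egen double sxx = total(X^2)
* number of observations left after dropping the current one
by cluster: gen double n1 = _N - 1
* leave-one-out mean
gen double mean_loo = (sx - X)/n1
* leave-one-out SD; needs at least two remaining observations,
* max(0, .) guards against tiny negative values from rounding
gen double sd_loo = sqrt(max(0, (sxx - X^2) - n1*mean_loo^2)/(n1 - 1)) if n1 > 1
list id X mean_loo sd_loo
```

For the observation with X = 0.2 the remaining values are 0.6, 0.6 and
0.8, so sd_loo should come out at about 0.115. The sum-of-squares
shortcut can lose precision when X is large relative to its spread,
which is why the intermediate variables are stored as double.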


regards
uli








