| From | "Nick Winter" <nwinter@policystudies.com> |
| To | <statalist@hsphsun2.harvard.edu> |
| Subject | RE: st: programming question: obtaining statistics from clustered data |
| Date | Wed, 26 Jun 2002 09:40:01 -0400 |
-----Original Message-----

For the standard deviation the answer seems to be more difficult. At the moment I can only think of a solution with a loop over the observations within each cluster. There must be a better solution, and I am sure that I have overlooked something obvious. But anyway, you may use the following as a starting point:

>>>>>>>>>

You can use the alternate expression for variance to compute the standard deviation relatively straightforwardly:

    gen X2=X^2
    sort cluster
    * running sums within cluster; the value at [_N] is the cluster total
    by cluster: gen sumX=sum(X)
    * subtract each observation's own value, leaving the sum over the others
    by cluster: replace sumX=sumX[_N] - X
    by cluster: gen sumX2=sum(X2)
    by cluster: replace sumX2=sumX2[_N] - X2
    * sd of the other _N-1 observations in the cluster
    by cluster: gen sd1 = sqrt(((_N-1)*sumX2 - sumX^2)/((_N-1)*(_N-2)))

--Nick Winter
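
For readers who want to see where the last line of the code comes from, here is a short sketch of the algebra behind the "alternate expression for variance". The symbols n, m, S, Q and the subscript -i are notation introduced here for illustration and do not appear in the original post; the sketch assumes X has no missing values within a cluster.

```latex
% Notation (assumed for this sketch, not from the original post):
%   n = _N, the number of observations in the cluster
%   S = \sum_j x_j and Q = \sum_j x_j^2, the cluster totals of X and X^2
%   S_{-i}, Q_{-i}: the same sums with observation i left out
%   (these are what sumX and sumX2 hold after the two replace steps)
\begin{align*}
S_{-i} &= S - x_i, \qquad Q_{-i} = Q - x_i^2, \qquad m = n - 1 \\
s_{-i}^2 &= \frac{1}{m-1}\left( Q_{-i} - \frac{S_{-i}^2}{m} \right)
          = \frac{m\,Q_{-i} - S_{-i}^2}{m\,(m-1)} \\
         &= \frac{(n-1)\,Q_{-i} - S_{-i}^2}{(n-1)(n-2)}
\end{align*}
```

Taking the square root gives sd1. The denominator is zero for clusters with fewer than three observations, so sd1 comes out missing there. Because sum() treats missing values as zero while _N still counts those observations, a cluster with missing X would probably need separate handling.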