From: Stas Kolenikov <skolenik@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: svy + aweights
Date: Thu, 10 Nov 2011 16:59:22 -0500

The nature of the -cluster()- variance estimators is such that they
control for any correlation pattern that might be observed within a
PSU. This is a non-parametric estimator, and you are probably thinking
along the lines of something like GEE.

Suppose you have a model

  y = grand mean + {m == cluster mean} + {u == individual mean}
      + {e == observation measurement error}

with n subjects per cluster and k observations per subject. Assume that
u and e are homoskedastic. If you have k=1 observation per subject, then
you cannot distinguish u and e, and have essentially one error term. The
covariance matrix is then Var[m+u+e] times an exchangeable correlation
structure with corr = Var[m]/(Var[m]+Var[u]+Var[e]). If you have
multiple observations, k>1, per subject, your covariance matrix is

  J(kn,kn,Var[m]) + I(n) # J(k,k,Var[u]) + I(nk)*Var[e],

which is a more complicated pattern.

In GEE, you have to put these structures into the objective function as
working correlation structures to get your estimates. With -cluster()-,
you don't have to, but you should expect your estimates to be less
efficient than those from a (feasible) GLS estimation would be if the
above model were true. As long as you have # of clusters -> infinity,
you can build a consistent estimator of (within-cluster) Var[y], which
will be accounted for in -svy- commands.

Hope this helps.

On Thu, Nov 10, 2011 at 4:48 PM, Jeph Herrin <stata@spandrel.net> wrote:
> I'm not sure I get this. How can correlations at one level be "engulfed"
> by correlations at another? The PSUs account for subject level correlation,
> but for each subject I have multiple observations.
>
> Moreover, if I -reshape-, does it still make sense to -svyset psu-? I
> thought not.
>
> On 11/10/2011 4:40 PM, Stas Kolenikov wrote:
>>
>> On Thu, Nov 10, 2011 at 4:01 PM, JH <junk@spandrel.net> wrote:
>>>
>>> But doesn't your suggestion ignore the correlation of observations within
>>> subjects?
>>
>> No. Unless your current -svyset-ting is -svyset _n-... and frankly I
>> don't know how that would behave with -reshape-. If you have PSUs in
>> your -svyset- (and NHANES does have them), then the correlations of
>> observations within the subjects will be engulfed by the correlations
>> of observations within the PSUs that -svy:- controls for.
>>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.
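The within-cluster covariance pattern described in the message can be checked numerically. The following is only a minimal sketch in plain Python, not anything from the original thread. The function names `cluster_covariance` and `correlation` and the variance components 1, 2, 1 are made up for illustration. It assumes observations in one PSU are ordered subject by subject, and it builds J(kn,kn,Var[m]) + I(n) # J(k,k,Var[u]) + I(nk)*Var[e] entry by entry.

```python
def cluster_covariance(n, k, var_m, var_u, var_e):
    """Covariance matrix of the n*k observations in one cluster (PSU).

    Observations are assumed to be ordered subject by subject, so
    observation i belongs to subject i // k.
    """
    size = n * k
    cov = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            value = var_m              # J(kn,kn,Var[m]): shared by every pair in the cluster
            if i // k == j // k:
                value += var_u         # I(n) # J(k,k,Var[u]): shared within a subject
            if i == j:
                value += var_e         # I(nk)*Var[e]: unique to each observation
            cov[i][j] = value
    return cov


def correlation(cov, i, j):
    """Correlation between observations i and j implied by cov."""
    return cov[i][j] / (cov[i][i] * cov[j][j]) ** 0.5


# Hypothetical variance components, chosen only for illustration.
VAR_M, VAR_U, VAR_E = 1.0, 2.0, 1.0

# k = 1: exchangeable structure, corr = Var[m] / (Var[m] + Var[u] + Var[e])
one_obs = cluster_covariance(3, 1, VAR_M, VAR_U, VAR_E)
print(correlation(one_obs, 0, 1))      # 0.25

# k = 2: two distinct off-diagonal correlations inside the same PSU
two_obs = cluster_covariance(2, 2, VAR_M, VAR_U, VAR_E)
print(correlation(two_obs, 0, 1))      # same subject: 0.75
print(correlation(two_obs, 0, 2))      # different subjects: 0.25
```

With k = 1 every off-diagonal correlation is the same, which is the exchangeable case. With k > 1 there are two levels of correlation, both confined to the PSU block, which is the sense in which the within-subject correlation is "engulfed" by the within-PSU correlation.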

