


Re: st: svy + aweights

From   Stas Kolenikov <>
Subject   Re: st: svy + aweights
Date   Thu, 10 Nov 2011 16:59:22 -0500

The nature of the -cluster()- variance estimators is such that they
control for any correlation pattern that might be observed within a
PSU. This is a non-parametric estimator, and you are probably thinking
along the lines of something like GEE.
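
To make the "non-parametric" point concrete, here is a minimal sketch of a cluster-robust variance for a plain sample mean, assuming equal weights and a single stratum. The function name and the toy data are hypothetical, and this is not Stata's implementation; it only illustrates that the estimator uses nothing but cluster-level totals of residuals, so no within-PSU correlation pattern has to be specified.

```python
def cluster_robust_var_of_mean(y, cluster):
    """Cluster-robust variance of the sample mean (equal weights, one stratum)."""
    n_obs = len(y)
    ybar = sum(y) / n_obs
    # Sum the residuals within each cluster; only these totals enter the
    # estimator, so any correlation pattern inside a cluster is absorbed.
    totals = {}
    for yi, g in zip(y, cluster):
        totals[g] = totals.get(g, 0.0) + (yi - ybar)
    n_clusters = len(totals)
    correction = n_clusters / (n_clusters - 1)  # finite-sample adjustment
    return correction * sum(t * t for t in totals.values()) / n_obs ** 2


# Toy data (made up): 3 clusters with 2 observations each.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
cluster = [1, 1, 2, 2, 3, 3]

v_clustered = cluster_robust_var_of_mean(y, cluster)
# Treating every observation as its own cluster gives the usual s^2/n.
v_independent = cluster_robust_var_of_mean(y, list(range(len(y))))
print(v_clustered, v_independent)
```

In this toy example the observations within a cluster are similar, so the clustered variance comes out larger than the one that treats all observations as independent.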

Suppose you have a model

y = grand mean + {m==cluster mean} + {u==individual mean} +
{e==observation measurement error}

with n subjects per cluster and k observations per subject. Assume that
u and e are homoskedastic. If you have k=1 observation per subject, then
you cannot distinguish u and e, and have essentially one error term. The
within-cluster covariance matrix is then Var[m+u+e] times an exchangeable
correlation structure with corr = Var[m]/(Var[m]+Var[u]+Var[e]). If you have
multiple observations, k>1, per subject, your covariance matrix is
J(kn,kn,Var[m]) + I(n) # J(k,k,Var[u]) + I(nk)*Var[e], which is a more
complicated pattern. In GEE, you have to put these structures into the
objective function as working correlation structures to get your
estimates. With -cluster()-, you don't have to, but you should expect
your estimates to be less efficient than they would be if the
above model were true and you ran a (feasible) GLS estimation. As
long as you have # of clusters -> infinity, you can build a consistent
estimator of (within-cluster) Var[y], which will be accounted for in
-svy- commands.
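
The two covariance patterns above can be written out numerically. The following is a minimal sketch in plain Python, assuming observations within a cluster are ordered subject by subject; the function name and the variance values are hypothetical and chosen only for illustration.

```python
def within_cluster_cov(n, k, var_m, var_u, var_e):
    """Covariance matrix for one cluster: n subjects, k observations each.

    Mirrors J(kn,kn,Var[m]) + I(n) # J(k,k,Var[u]) + I(nk)*Var[e],
    with rows ordered so that subject i occupies rows i*k .. i*k+k-1.
    """
    size = n * k
    cov = [[0.0] * size for _ in range(size)]
    for a in range(size):
        for b in range(size):
            value = var_m              # cluster effect is shared by every pair
            if a // k == b // k:
                value += var_u         # same subject: individual effect shared
            if a == b:
                value += var_e         # measurement error on the diagonal only
            cov[a][b] = value
    return cov


# k = 1: exchangeable structure, corr = Var[m] / (Var[m] + Var[u] + Var[e]).
cov_k1 = within_cluster_cov(n=3, k=1, var_m=1.0, var_u=2.0, var_e=3.0)
rho_k1 = cov_k1[0][1] / cov_k1[0][0]

# k = 2: two distinct off-diagonal values (same subject vs. different subject).
cov_k2 = within_cluster_cov(n=2, k=2, var_m=1.0, var_u=2.0, var_e=3.0)
for row in cov_k2:
    print(row)
```

With k=1 every off-diagonal entry equals Var[m], which is the exchangeable case. With k>1 the entries for pairs within the same subject are Var[m]+Var[u], while pairs from different subjects share only Var[m].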

Hope this helps.

On Thu, Nov 10, 2011 at 4:48 PM, Jeph Herrin <> wrote:
> I'm not sure I get this. How can correlations at one level be "engulfed"
> by correlations at another? The PSUs account for subject level correlation,
> but for each subject I have multiple observations.
> Moreover, if I -reshape-, does it still make sense to -svyset psu-? I
> thought not.
> On 11/10/2011 4:40 PM, Stas Kolenikov wrote:
>> On Thu, Nov 10, 2011 at 4:01 PM, JH<>  wrote:
>>> But doesn't your suggestion ignore the correlation of observations within
>>> subjects?
>> No. Unless your current -svyset-ting is -svyset _n-... and frankly I
>> don't know how that would behave with -reshape-. If you have PSUs in
>> your -svyset- (and NHANES does have them), then the correlations of
>> observations within the subjects will be engulfed by the correlations
>> of observations within the PSUs that -svy:- controls for.

Stas Kolenikov, also found at
Small print: I use this email account for mailing lists only.

