


Re: st: GLLAMM and rescaling sampling weights

From   Susanna Makela
Subject   Re: st: GLLAMM and rescaling sampling weights
Date   Fri, 24 Jun 2011 16:14:31 +0530

Stas, thanks so much for your response. I'm a little confused though -
you say the question arises when there are sampling weights for PSUs
and then separate sampling weights for individuals within PSUs - but I
only have sampling weights for women within states. Since, as you say,
I don't have these weights in DHS, do I not need to worry about
weighting at all? Or do the state-level women's weights somehow
combine these two?

I have also looked at the Pfeffermann et al. (1998) paper, but I didn't
quite get all of that either... I guess I'll see how things look with option
2 below for now. I was going to use -pwigls-, but that seems to be
only for use with two-level models, though rescaling the weights by
hand isn't hard.

Your response brings up another question: is it bad (as in bad
statistical practice/against statistical theory) to model states as a
random effect if they are also used for stratification? States aren't
really samples from a "population" of all "possible" states in the way
that PSUs in the DHS are a sample from all PSUs, so in that sense you
could argue against modeling them as random effects - but how does
stratification play into it? I thought it would be important to
account for states as another level since the PSUs are nested in
states, and including state-level variables without that structure
could lead to the atomistic fallacy (making incorrect inferences about
state-level variables based on individual-level data).

Thanks again for your help!


The advice on rescaling the weights is tangled at best. Pfeffermann et al.
(1998) show
that the bias of the variance estimates depends on how you define your
weights. The question arises in the following situation: you have
sampling weights for PSUs, and you have separate sampling weights for
individuals within PSUs (which you don't have in DHS, so you can stop
reading here unless you are really curious about what Rabe-Hesketh and
Skrondal meant in that phrase). Then there's been some inconclusive
research into the following options:

1. leave the individual weights as they are (summing up to the size of the PSU)

2. rescale them so that they sum up to the sample size (# of
observations sampled from a PSU)

3. rescale them to the so-called "effective" weights so that their
sum of squares equals the sample size. (In estimation of the variance
components, it is the squares of the weights that matter.)
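
The rescaling options listed above can be sketched in a few lines. This is a minimal illustration in Python, not the -pwigls- implementation: the function names and the example weights are made up, and each function is meant to be applied to the individual-level weights within a single PSU. Option 1 needs no code (the weights are left alone). Option 3 is coded as worded above (squared weights summing to the sample size); the last function is a variant, included here as an assumption about what "effective" weights usually refers to, in which the weights sum to the effective sample size (sum w)^2 / sum w^2.

```python
from math import sqrt


def scale_to_sample_size(weights):
    """Option 2: rescale so the weights sum to the number of sampled units."""
    n = len(weights)
    total = sum(weights)
    return [w * n / total for w in weights]


def scale_sum_of_squares(weights):
    """Option 3 as worded above: rescale so the squared weights sum to n."""
    n = len(weights)
    sum_sq = sum(w * w for w in weights)
    return [w * sqrt(n / sum_sq) for w in weights]


def scale_to_effective_size(weights):
    """Variant (assumption): weights sum to (sum w)^2 / sum w^2."""
    total = sum(weights)
    sum_sq = sum(w * w for w in weights)
    return [w * total / sum_sq for w in weights]


# Hypothetical within-PSU weights for four sampled individuals.
psu_weights = [10.0, 10.0, 20.0, 40.0]

opt2 = scale_to_sample_size(psu_weights)
opt3 = scale_sum_of_squares(psu_weights)
eff = scale_to_effective_size(psu_weights)

print(sum(opt2))                  # sum of option 2 weights
print(sum(w * w for w in opt3))   # sum of squared option 3 weights
print(sum(eff))                   # effective sample size for this PSU
```

All three rescalings multiply every weight in the PSU by one constant, so the relative weights of individuals within the PSU are unchanged; only the apparent cluster size differs.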

Some papers claim the second option gives the least biased estimates;
others, that the third one is the way to go. Other re-weighting
options can be entertained, too. See Pfeffermann et al. (1998) for
an in-depth analysis. There were also papers by Asparouhov (the primary
programmer at Mplus) and Stapleton (multilevel researcher at UMBC),
but I am less convinced by these, as they review the existing cook-book
recipes rather than try to provide theory-based advice (unlike the
Pfeffermann et al. paper, which derives explicit small-sample bias
expressions that are more complicated than just the sums of squares
of weights, and involve the response variables, too).

I would tend to think that states are used as stratification variables
in DHS, and modeling them as random effects sounds rather odd to me. I
would specify my model as children nested in mothers nested in
clusters, and would not go above clusters in the random part of
the model (although of course if you have state-specific variables
that are needed in your analysis, you should include them in your
