



Re: st: GLLAMM and rescaling sampling weights


From   Stas Kolenikov <skolenik@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: GLLAMM and rescaling sampling weights
Date   Thu, 23 Jun 2011 10:43:13 -0400

The advice on rescaling the weights is tangled at best. Pfeffermann et
al. (1998, http://www.citeulike.org/user/ctacmo/article/711637) show
that the bias of the variance estimates depends on how you define your
weights. The question arises in the following situation: you have
sampling weights for PSUs, and you have separate sampling weights for
individuals within PSUs (which you don't have in DHS, so you can stop
reading here unless you are really curious about what Rabe-Hesketh and
Skrondal meant in that phrase). Then there's been some inconclusive
research into the following options:

1. leave the individual weights as they are (summing up to the size of the PSU)

2. rescale them so that they sum up to the sample size (# of
observations sampled from a PSU)

3. rescale them to the so-called "effective" weights so that their
sum and their sum of squares both equal the effective sample size.
(In estimation of the variance components, it is the squares of the
weights that matter.)
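
As a rough illustration of the three options above, here is a minimal Python sketch (not GLLAMM or Stata code). The function name `rescale_within_psu` and the method labels "raw", "size" and "effective" are made up for this example. Option 3 is taken here to mean multiplying by (sum of weights) / (sum of squared weights) within each PSU, so that the rescaled weights and their squares both sum to the effective sample size (sum w)^2 / (sum w^2); that reading is an assumption, and other definitions of "effective" weights may differ.

```python
from collections import defaultdict


def rescale_within_psu(psu_ids, weights, method):
    """Rescale individual-level sampling weights within each PSU.

    method (hypothetical labels, chosen for this sketch):
      "raw"       - option 1: leave the weights unchanged
      "size"      - option 2: weights sum to the number sampled in the PSU
      "effective" - option 3 (assumed reading): weights and their squares
                    both sum to (sum w)^2 / (sum w^2) within the PSU
    """
    sum_w = defaultdict(float)   # sum of weights per PSU
    sum_w2 = defaultdict(float)  # sum of squared weights per PSU
    n = defaultdict(int)         # number of sampled observations per PSU
    for psu, w in zip(psu_ids, weights):
        sum_w[psu] += w
        sum_w2[psu] += w * w
        n[psu] += 1

    def factor(psu):
        if method == "raw":
            return 1.0
        if method == "size":
            return n[psu] / sum_w[psu]
        if method == "effective":
            return sum_w[psu] / sum_w2[psu]
        raise ValueError("unknown method: " + repr(method))

    return [w * factor(psu) for psu, w in zip(psu_ids, weights)]


# Toy data: two PSUs with unequal weights in the first one.
psu_ids = ["a", "a", "a", "b", "b"]
weights = [10.0, 20.0, 30.0, 5.0, 5.0]

w_size = rescale_within_psu(psu_ids, weights, "size")
w_eff = rescale_within_psu(psu_ids, weights, "effective")
```

With equal weights inside a PSU (PSU "b" in the toy data), options 2 and 3 coincide; they differ only where the weights vary within a PSU, which is exactly where the choice of scaling matters.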

Some papers claim the second option gives the least biased estimates;
others, that the third one is the way to go. Other re-weighting
options can be entertained, too. See Pfeffermann et al. (1998) for
in-depth analysis. There were also papers by Asparouhov (the primary
programmer at Mplus) and Stapleton (multilevel researcher at UMBC),
but I am less convinced by these as they review the existing cook-book
recipes rather than try to provide theory-based advice (unlike
Pfeffermann et al., who derive explicit small-sample bias
expressions that are more complicated than just the sums of squares
of weights, and involve the response variables, too).

I would tend to think that states are used as stratification variables
in DHS, and modeling them as random effects sounds rather odd to me. I
would specify my model as children nested in mothers nested in
clusters, and would not go above clusters in the random part of
the model (although of course if you have state-specific variables
that are needed in your analysis, you should include them in your
regression).

On Thu, Jun 23, 2011 at 8:42 AM, Susanna Makela
<susanna.m.makela@gmail.com> wrote:
> Dear Statalisters,
>
> Apologies in advance if this is a double posting; I sent this email a
> few days ago but haven't seen the message show up in the archives or
> the statalist digest, so I'm retrying.
>
> Some background: I am using GLLAMM to run a multilevel logistic
> regression on a round of DHS (Demographic and Health Survey) data. I
> have a three-level model with children as level 1, PSUs as level 2,
> and state as level 3. Because my unit of observation is children and
> because I am pooling data across states, I am using the national-level
> women's weights as my level-1 sampling weights (since information on
> children comes from interviewing their mothers). I am assigning
> sampling weights of 1 for the PSUs and states since GLLAMM requires
> weights to be specified for all levels of the model.
>
> The GLLAMM manual notes that the pweight option, which holds the
> sampling weights, "should be used with caution if the sampling weights
> apply to units at a lower level than the highest level in the
> multilevel model. The weights are not rescaled; rescaling is the
> responsibility of the user."
>
> My questions are:
>
> - What is the appropriate way to scale sampling weights when they
> apply to the lowest level in the model?
>
> - Do I have to adjust for the fact that, in my particular case, I am
> assigning the same weight to all children of the same mother?
>
> I've read "Multilevel modeling of complex survey data" (Rabe-Hesketh
> and Skrondal, J. R. Statist. Soc. A 2006), but didn't quite understand
> all of it. The published papers I've found that use both DHS data and
> GLLAMM don't discuss how - if at all - they rescale the sampling
> weights.
>

-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.