Re: st: gllamm or SVY

From   Stas Kolenikov
Subject   Re: st: gllamm or SVY
Date   Fri, 23 Apr 2010 08:31:24 -0500

Survey design-based inference is said to be doubly consistent. First,
the parameter estimates are consistent for the finite population
parameters: as you increase the sample size, the estimates converge
to the parameters of the so-called census regressions, i.e., the
regressions hypothetically fitted to the whole population. These may
suffer from a whole lot of problems like heteroskedasticity,
nonlinearity, what have you, but at least these are well-defined
parameters: you take the population, you apply a certain
computational rule to it, and voila. Second, if your model
is correct, the parameter estimates converge to these model parameters
(although here you need to think about complicated asymptotics with
both population size and sample size going to infinity in some
concordant manner). See If your model is
incorrect, God only knows what you are estimating in the model-only
world; in the survey statistics world, you are still estimating
parameters of the finite population, although they might be harder to
interpret.
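
The first kind of consistency can be illustrated with a small simulation. This is only a sketch under stated assumptions: the finite population, the quadratic relation between y and x, the inclusion probabilities, and all names such as `weighted_slope` are hypothetical and chosen for illustration. A linear model is deliberately wrong here, yet the design-weighted slope should land near the census regression slope, while the unweighted slope estimates something that depends on how the sample was drawn.

```python
import random


def weighted_slope(x, y, w):
    """Slope of the weighted least-squares line of y on x (with intercept)."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


rng = random.Random(2010)  # fixed seed so the run is reproducible

# Hypothetical finite population: y is quadratic in x, so a linear model is wrong.
N = 200_000
pop_x = [rng.uniform(0.0, 2.0) for _ in range(N)]
pop_y = [xi ** 2 + rng.gauss(0.0, 0.5) for xi in pop_x]

# Census regression: the same computational rule applied to the whole population.
census_slope = weighted_slope(pop_x, pop_y, [1.0] * N)

# Unequal-probability (Poisson) sampling: units with x >= 1 are oversampled.
incl_prob = [0.10 if xi >= 1.0 else 0.02 for xi in pop_x]
sample = [i for i in range(N) if rng.random() < incl_prob[i]]
xs = [pop_x[i] for i in sample]
ys = [pop_y[i] for i in sample]

# Design-based estimate: weight each sampled unit by 1 / inclusion probability.
design_slope = weighted_slope(xs, ys, [1.0 / incl_prob[i] for i in sample])

# Model-only estimate: ignore the design and treat the sample as i.i.d.
naive_slope = weighted_slope(xs, ys, [1.0] * len(xs))

print(f"census slope:          {census_slope:.3f}")
print(f"design-weighted slope: {design_slope:.3f}")
print(f"unweighted slope:      {naive_slope:.3f}")
```

If the linear model were actually correct and selection depended only on x, the weighted and unweighted slopes would agree in large samples; the gap appears here because the model is misspecified.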

When applied to multilevel data, this means that with the model-based
approach you'd have to make a leap of faith. In fact, many leaps: (1)
that every regression at every level is correctly specified; (2) that
you modelled all the random effects you needed; (3) that your random
effects are homoskedastic and have a good old normal distribution; (4)
that you've chosen a procedure with appropriate numeric properties
(-gllamm- is totally fine with a sufficient number of integration
points, but it gets polynomially slow as you increase the number of
these).
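
To get a feel for the cost in point (4), here is a rough sketch that counts integrand evaluations. It assumes a Cartesian-product quadrature rule, where each random effect gets its own set of integration points and every combination has to be evaluated; the function name `quadrature_grid` is hypothetical, and this is not -gllamm-'s actual code.

```python
from itertools import product


def quadrature_grid(nip, n_random_effects):
    """All combinations of node indices for a product rule with nip nodes per dimension."""
    return list(product(range(nip), repeat=n_random_effects))


# Number of grid points at which the likelihood contribution of one
# top-level cluster must be evaluated, under the product-rule assumption.
for nip in (4, 8, 16):
    for dims in (1, 2, 3):
        n_eval = len(quadrature_grid(nip, dims))
        print(f"nip={nip:2d}  random effects={dims}:  {n_eval:5d} evaluations")
```

Under that assumption the count is nip raised to the number of random effects, so doubling the integration points in a model with three random effects multiplies the work by eight.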

There are also conceptual/interpretational differences. It is believed
that multilevel models kinda think about random components as drawn
from a nicely parameterized distribution, so you'd have to think that
your countries are drawn at random from a hypothetical hyperpopulation
of all possible countries, and that their random effects have a nice
normal distribution (you can relax that with -gllamm-, which allows
non-parametric distributions, but identifying these non-parametric
distributions requires a large number of units at the corresponding
level). You can probably think about hospitals that way, but I would
certainly second Steve's suggestion to model countries as fixed
effects.

For further discussion, see an article by -gllamm-'s authors:

On Fri, Apr 23, 2010 at 4:57 AM, Irene Moral wrote:
> Dear all,
> I'm analysing a sample drawn from a stratified cluster designed study.
> In my study, I considered countries as strata, health centers in each
> country as clusters and then patients are selected within each cluster.
> I'm using the svyset and all the svy commands to define the sample design
> and to run analysis, but some people asked me about using multilevel
> analysis instead of svy.
> I have understood that other options are to use the gllamm command or to
> panel data as longitudinal and use xt commands, ¿it is true?
> can anybody explain in which cases is more useful to use one than the other?
> or what type of errors am I exposed to when using svy?

Stas Kolenikov, also found at
Small print: I use this email account for mailing lists only.
