From | Owen Corrigan <ocorrigan@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | st: Multilevel modelling: questions about independence, residuals, balance |
Date | Mon, 31 Jan 2011 13:12:13 +0000 |
I have some general questions about Multilevel Modeling (MLM). I've been reading the Stata Press book by Rabe-Hesketh and Skrondal (2008, 2nd edn.) but there are some specifics I'm unsure about.

The dataset consists of individual person-level (level-1) observations (N = 23759) across 14 level-2 units (the units are countries). The level-1 data are spread very unevenly (unbalanced) over the level-2 units: the minimum number of observations per group is approx. 300 and the maximum approx. 3000. The dataset is actually two merged datasets from 2006 and 2007 (independent observations though, non-longitudinal). I have covariates at both levels; the point of the entire exercise is really to test whether a hypothesised level-2 covariate is significant.

1. Diagnostics: my level-2 residuals (--predict varname, reffects-- after --xtmixed--) are non-normal. Does the violation of this assumption mean that I cannot be confident in the standard errors for my key level-2 variable (which proved significant, incidentally)? I have read (Maas & Hox 2004) that this only matters for the standard errors of the random effects; but these are of no interest to me, and I am only concerned with the beta/standard error for this one level-2 variable.

--->QUESTION: If this variable is significant despite non-normality of the level-2 residuals, can I content myself with this? Or do I have reason to doubt the significance?

-AND, Maas & Hox say that "Robust standard errors turn out to be more reliable than the asymptotic standard errors based on maximum likelihood", so should I request robust standard errors? This is only possible with GLS estimation as opposed to ML/REML; do I lose out on something by choosing this estimation method? (Gllamm is a non-runner.)

-ALSO, the level-1 and level-2 residuals must be independent and uncorrelated. How can I test for this? If I just generate one variable containing the level-1 residuals and another containing the level-2 residuals, can I just do a --pwcorr-- on the two and assess it like that? Or does this require something more fancy/complicated?

2. Balance: as I said, my dataset is really quite unbalanced; does this have any implications for inference or standard errors? Could weighting play any role here? If so, how and why might one go about weighting, say, for countries which have relatively few cases? Or is this not something I need to worry about?

Many thanks for any clarification you may be able to offer on any point here, small or large.

Owen Corrigan
PhD student
Trinity College Dublin
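For concreteness, the diagnostic steps described in point 1 can be sketched in Stata roughly as follows. This is only an illustration of the checks as asked about, not a claim that they are adequate. All variable names (y, x1, x2, z, country, u0, e1, tagc) are hypothetical placeholders, and a random-intercept model is assumed.

```stata
* Hypothetical names: y = outcome, x1 x2 = level-1 covariates,
* z = the level-2 covariate of interest, country = level-2 identifier.
* Assumed model: random intercept for country, fitted by ML.
xtmixed y x1 x2 z || country:, mle

* Level-2 residuals: predicted random intercepts (constant within country)
predict u0, reffects

* Level-1 residuals: response minus fitted values including random effects
predict e1, residuals

* Flag one observation per country, so that each of the 14 level-2
* residuals is counted once when its distribution is inspected
egen tagc = tag(country)
qnorm u0 if tagc
swilk u0 if tagc

* Distribution of the level-1 residuals
qnorm e1

* The simple correlation check asked about above
pwcorr e1 u0, sig

* Group sizes, to see how unbalanced the countries are
tabulate country
```

With only 14 countries, the normality plot and test for u0 rest on 14 values, so they are likely to be rough guides at best.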