Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Stas Kolenikov <skolenik@gmail.com> |
To | "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |
Subject | Re: st: Multiple imputation with survey replicate weights |
Date | Thu, 20 Feb 2014 09:33:52 -0600 |
Stata is doing the right thing by preventing you from doing dubious things. The interface between complex survey inference and multiple imputation is surprisingly poorly studied, given how often the two come up together. The statistically appropriate way to combine imputation and replicate weights that I am aware of is to use the bootstrap or BRR approach: create a single imputation within each bootstrap/BRR replicate, and re-estimate your model with that replicate weight based on the imputed data. See Shao and Sitter (1996; http://www.citeulike.org/user/ctacmo/article/1269394).

At the moment, this requires custom programming of an estimation command that combines one imputation iteration with the command of interest. I am vaguely planning to develop a Stata Journal paper to describe the process, but it is only at the conceptualization stage now.

Here's an example. It is not particularly stable: the combinations of -mi- and -svy- are still tricky, as they have contradictory expectations of what is known about the data, and I have to force one to ignore the other, and vice versa.

webuse nhanes2brr, clear
gen age2 = age*age

cap pro drop mymireg
program define mymireg, properties( svyb )
    syntax [varlist] [if] [in] [pw iw /] , [*]
    * local macro `weight' contains the weight type
    * local macro `exp' contains the weight variable
    * local macro varlist contains the list of explanatory variables for the final regression
    * it is used to keep Stata from thinking that estimation has already been done
    preserve
    mi set wide
    mi register regular region1 region2 region3 rural black orace age age2 tibc tcresult
    mi register imputed lead zinc copper vitaminc albumin tgresult
    mi impute chained (pmm) lead zinc copper vitaminc albumin tgresult = region1 region2 region3 ///
        rural black orace age age2 tibc tcresult [pw=`exp'], add(1)
    mi extract 1, clear
    logistic highbp lead `varlist' [pw=`exp']
    restore
end

svy brr, saving( lead_imputed_logit, replace ) : mymireg height weight age female
use lead_imputed_logit, clear
sum

Use at your own risk. Let me repeat: USE AT YOUR OWN RISK. Maybe like this: use at_your_own_risk, clear?

A few caveats:

1. -svy brr- will report point estimates based on a single imputation; these are useless and need to be discarded.

2. The right coefficients and standard errors come out of the -summarize- at the end: the means of the replicate estimates are the coefficients, and their standard deviations are the standard errors. I used to be able to produce them with -bs4rw- followed by -estat bootstrap-, but for whatever reason it stopped working (it used to in 2010). Probably the internal format of what -bootstrap- expects has changed, and what -bs4rw- supplies is no longer compatible with it.

3. I used the equivalence between the bootstrap and BRR. Things will not work appropriately with the jackknife: each jackknife replicate is too close to the full data, so the imputation model fitted within it will be too close to the one based on the full data. Hence the sampling variability in the imputation model will be insufficient, and the standard errors will be underestimated. Likewise, the methods that compress replicate weight variability (BRR with Fay's adjustment; the mean bootstrap) may not be able to generate enough sampling variability in the imputation process, either.

4. As you can clearly see, the code is cumbersome, and probably not particularly efficient. I might have been able to deal with -mi extract- better, for instance, and all these -preserve-s are obviously going to eat up a good fraction of computing time with large data sets.

-- Stas Kolenikov, PhD, PStat (ASA, SSC)
-- Principal Survey Scientist, Abt SRBI
-- Opinions stated in this email are mine only, and do not reflect the position of my employer
-- http://stas.kolenikov.name

On Wed, Feb 19, 2014 at 4:41 PM, Joshua Mitts <joshua.mitts@gmail.com> wrote:
> Has anyone found a way to use survey replicate weights with multiply
> imputed data? The svy manual states:
>
> mi estimate may be used with svy linearized if the estimation command
> allows mi estimate; it may not be used with svy bootstrap, svy brr,
> svy jackknife, or svy sdr.
>
> And I receive this error when trying to fit a logit model:
>
> vce(brr) previously set by mi svyset is not allowed with mi estimate
>
> Thanks very much,
> Josh

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/
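
A minimal sketch of the final -summarize- step from caveat 2, written out per coefficient. It assumes the replicate file lead_imputed_logit.dta saved by the -svy brr- call in the example exists and holds one numeric variable per coefficient and one observation per replicate. The display layout is only illustrative, and nothing here is part of the original post's code.

```stata
* Assumption: lead_imputed_logit.dta was written by
*   svy brr, saving( lead_imputed_logit, replace ) : mymireg ...
use lead_imputed_logit, clear

* One row per replicate-estimate variable:
*   mean across replicates -> point estimate
*   sd across replicates   -> standard error
foreach v of varlist _all {
    quietly summarize `v'
    display as text "`v'" _col(24) as result %9.4f r(mean) _col(38) %9.4f r(sd)
}
```

Note that r(sd) divides by the number of replicates minus one, whereas BRR variance formulas typically divide by the number of replicates, so the two can differ slightly when there are few replicates.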