
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Multiple imputation with survey replicate weights


From   Joshua Mitts <[email protected]>
To   [email protected]
Subject   Re: st: Multiple imputation with survey replicate weights
Date   Thu, 20 Feb 2014 13:48:56 -0500

Thanks very much.

Josh

On Thu, Feb 20, 2014 at 10:33 AM, Stas Kolenikov <[email protected]> wrote:
> Stata is doing the right thing in preventing you from doing dubious
> things. The interface of complex survey data inference and multiple
> imputation is surprisingly poorly studied, given how common it is. The
> statistically appropriate way to combine imputation and replicate
> weights that I am aware of is to use the bootstrap or BRR approach;
> create a single imputation within each bootstrap/BRR replicate; and
> re-estimate your model with that replicate weight based on imputed
> data. See Shao and Sitter (1996;
> http://www.citeulike.org/user/ctacmo/article/1269394). At the moment,
> this requires custom programming of an estimation command that
> combines one imputation iteration with the command of interest. I am
> vaguely planning to develop a Stata Journal paper to describe the
> process, but it is only at the conceptualization stage now. Here's an
> example (not particularly stable: combining -mi- and -svy- is still
> tricky, as they have contradictory expectations of what is known about
> the data, and I have to force each one to ignore the other):
>
> webuse nhanes2brr, clear
> gen age2 = age*age
> cap pro drop mymireg
> program define mymireg, properties( svyb )
> syntax [varlist] [if] [in] [pw iw /] , [*]
>   * local macro `weight' contains the weight type
>   * local macro `exp' contains the weight variable
>   * local macro `varlist' contains the list of explanatory variables
>   *   for the final regression
>   * it is used to keep Stata from thinking that estimation has
>   *   already been done
>   preserve
>   mi set wide
>   mi register regular region1 region2 region3 rural black orace age ///
>      age2 tibc tcresult
>   mi register imputed lead zinc copper vitaminc albumin tgresult
>   mi impute chained (pmm) lead zinc copper vitaminc albumin tgresult = ///
>      region1 region2 region3 rural black orace age age2 tibc tcresult ///
>      [pw=`exp'], add(1)
>   mi extract 1, clear
>   logistic highbp lead `varlist' [pw=`exp']
>   restore
> end
> svy brr, saving(lead_imputed_logit, replace) : mymireg height weight age female
> use lead_imputed_logit, clear
> sum
>
> Use at your own risk. Let me repeat: USE AT YOUR OWN RISK. Maybe like this:
>
> use at_your_own_risk, clear?
>
> A few caveats:
> 1. -svy brr- will report point estimates based on a single imputation;
> these are useless and should be discarded.
> 2. The right coefficients and standard errors come out of the
> -summarize- at the end: the means of the saved replicate estimates are
> the coefficients, and their standard deviations are the standard
> errors. I used to be able to produce them with -bs4rw- followed by
> -estat bootstrap-, but for whatever reason it stopped working (it
> used to in 2010) -- probably the internal format of what -bootstrap-
> expects changed, and what -bs4rw- supplies is no longer compatible
> with it.
> 3. I used the equivalence between the bootstrap and BRR; things will
> not work appropriately with jackknife, as it does not provide enough
> sampling variability, and the imputation model will be too close to
> that based on the full data. Hence, sampling variability in the
> imputation model will be insufficient, and the standard errors will be
> underestimated. Likewise, the compressed replicate weight variability
> methods (BRR with Fay's adjustment; mean bootstrap) may not be able to
> generate enough sampling variability in the imputation process,
> either.
> 4. As you can clearly see, the code is cumbersome and probably not
> particularly efficient -- I could probably have handled -mi extract-
> better, for instance, and all these -preserve-s are obviously going
> to eat up a good fraction of computing time with large data sets.
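
A minimal sketch of the -summarize- step from caveat 2, not part of the original message. It assumes the file written by the -svy brr, saving()- call above holds one observation per replicate and one variable per coefficient; the variable names in that file are not assumed, so the loop simply runs over whatever is there. The mean of each variable plays the role of the point estimate, and the standard deviation that of the standard error:

```stata
* Assumes lead_imputed_logit.dta was created by the -svy brr, saving()- call above
use lead_imputed_logit, clear
foreach v of varlist _all {
    quietly summarize `v'
    * -summarize- leaves r(mean) and r(sd) behind; r(sd) uses the N-1 divisor
    display as text "`v'" _col(24) as result %10.4f r(mean) _col(38) %10.4f r(sd)
}
```

This prints the same numbers as the plain -sum- in the example, just restricted to the two columns of interest.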
>
> -- Stas Kolenikov, PhD, PStat (ASA, SSC)
> -- Principal Survey Scientist, Abt SRBI
> -- Opinions stated in this email are mine only, and do not reflect the
> position of my employer
> -- http://stas.kolenikov.name
>
>
>
> On Wed, Feb 19, 2014 at 4:41 PM, Joshua Mitts <[email protected]> wrote:
>> Has anyone found a way to use survey replicate weights with multiply
>> imputed data?  The svy manual states:
>>
>> mi estimate may be used with svy linearized if the estimation command
>> allows mi estimate; it may not be used with svy bootstrap, svy brr,
>> svy jackknife, or svy sdr.
>>
>> And I receive this error when trying to fit a logit model:
>>
>> vce(brr) previously set by mi svyset is not allowed with mi estimate
>>
>> Thanks very much,
>> Josh
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/

