Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: ICE and instrumental variables


From   Stas Kolenikov <skolenik@gmail.com>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: ICE and instrumental variables
Date   Fri, 18 Oct 2013 09:29:59 -0500

Mean substitution will make it way worse. Multiple imputation
followed by IV is sort of defensible from the general perspective of
missing data, in the sense that it has not been shown to be wrong
(although I could be mistaken -- I do not subscribe to Econometrica or
J of Econometrics or anything like that; if David Drukker is aware of
any such references, this would be a perfect time to chime in :) ).
All the other methods you have been contemplating ARE wrong, and
that's very easy to demonstrate, either with a very rudimentary
simulation or by looking in any missing data book.
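
A rudimentary simulation of this kind might look like the sketch below.
Everything in it is an assumption made for illustration: the variable
names (y, x, z, w, w_ms), the coefficients, the sample size, and the
50% missingness rate are all hypothetical. The instrument z is
deliberately correlated with the covariate w, so that z is valid only
when w is properly controlled for.

```stata
* Sketch only: all names and parameter values are hypothetical.
clear
set seed 12345
set obs 5000

* w: exogenous covariate; z: instrument, correlated with w
generate w = rnormal()
generate z = 0.6*w + rnormal()

* u: structural error; x is endogenous because it depends on u
generate u = rnormal()
generate x = 0.8*z + 0.5*w + 0.5*u + rnormal()

* Outcome equation: the true coefficient on x is 2
generate y = 1 + 2*x + w + u

* Benchmark: 2SLS on the complete data
ivregress 2sls y w (x = z)

* Delete about half of w completely at random, then mean-substitute
generate w_ms = w
replace w_ms = . if runiform() < 0.5
summarize w_ms, meanonly
replace w_ms = r(mean) if missing(w_ms)

* 2SLS on the mean-substituted data
ivregress 2sls y w_ms (x = z)
```

Comparing the coefficient on x across the two -ivregress- runs shows
how far the mean-substituted estimate drifts from the complete-data
benchmark. Under this design the part of w that was wiped out ends up
in the error term, where it is correlated with z.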

-- Stas Kolenikov, PhD, PStat (ASA, SSC)
-- Senior Survey Statistician, Abt SRBI
-- Opinions stated in this email are mine only, and do not reflect the
position of my employer
-- http://stas.kolenikov.name



On Fri, Oct 18, 2013 at 9:17 AM, James Bernard <jamesstatalist@gmail.com> wrote:
> Thanks Stas,
>
> This was very helpful.
>
> Then, may I ask an even more basic question:
>
> The more rudimentary method is to replace the missing values with the
> average and then use ivreg/2SLS, etc.
>
> Is this also too problematic?
>
> As you said, the use of ivreg with imputation is problematic. Would
> mean substitution make it worse?
>
> Given all these costs, would it be better to forget imputation and
> mean replacement altogether?
>
> Thanks again,
>
> James
>
>
>
> On Fri, Oct 18, 2013 at 10:06 PM, Stas Kolenikov <skolenik@gmail.com> wrote:
>> Nope, this approach won't make sense, as it will bias both the point
>> estimates and the standard errors. And if you think it does, you just
>> need to read more to understand WHY multiple imputation works.
>> Essentially, what you have proposed is the (conditional) mean
>> imputation, which squeezes variability... which of course is great if
>> you want to raise your reported R^2 from 0.1 to 0.6, right?
>>
>> What you need to do is to follow the mainstream workflow of multiple
>> imputation: impute a bunch of data sets (M=5 is a funny number that is
>> only good for estimation of the means; computer power is not the
>> scarce resource it was in the 1970s when Don Rubin introduced multiple
>> imputation), run -ivregress 2sls- on each, and combine them with -mi
>> estimate, cmdok-.
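
The workflow described above can be sketched roughly as follows. This
is an illustration under stated assumptions, not a tested recipe: the
data are simulated, the variable names (y, x, z, w) are hypothetical,
and the official -mi impute regress- is swapped in for -ice- to keep
the example self-contained. Data already imputed with -ice- could
instead be converted with -mi import ice- before the estimation step.

```stata
* Sketch only: simulated data, hypothetical names and values.
clear
set seed 2013
set obs 2000
generate w = rnormal()
generate z = 0.6*w + rnormal()
generate u = rnormal()
generate x = 0.8*z + 0.5*w + 0.5*u + rnormal()
generate y = 1 + 2*x + w + u

* Make about 40% of w missing completely at random
replace w = . if runiform() < 0.4

* Declare the mi structure and the variable to be imputed
mi set wide
mi register imputed w

* Impute w from the outcome, the endogenous regressor, and the
* instrument; M = 50 imputations rather than 5
mi impute regress w y x z, add(50) rseed(2013)

* Run 2SLS on each imputed data set and combine the results;
* cmdok lets -mi estimate- run a command it does not officially support
mi estimate, cmdok: ivregress 2sls y w (x = z)
```

The imputation model here includes y, x, and z on purpose: leaving the
outcome or the instrument out of the imputation model is a common way
to distort the relationships that the IV step then tries to estimate.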
>>
>> There is little research about the properties of IV estimates under
>> multiple imputation, as economists basically don't trust this heavily
>> model-driven method, and the clash with the less heavily parametric
>> IV/GMM paradigm is inevitable. But assuming that multiple imputation
>> properly captures the first two moments of the full distribution of
>> the data, the asymptotic properties of IV should still hold. Of
>> course, tests such as those for weak instruments would require
>> another scratch of the forehead.
>>
>>
>> On Fri, Oct 18, 2013 at 8:46 AM, James Bernard <jamesstatalist@gmail.com> wrote:
>>> Hi all,
>>>
>>> I am a newbie to imputation. I have been trying -ice- for imputation.
>>> What I realized is that -ice- is just the imputation part. The imputed
>>> data can then be used for estimation with -mi- (count models, etc).
>>>
>>> I browsed through the list server and found that people were
>>> unsure whether they can use -mi- estimation for instrumental
>>> variables (2SLS).
>>>
>>> It would be good if we can use ivreg with -mi-.
>>>
>>> If not, would this approach make any sense:
>>>
>>> 1- impute the data using ice (e.g., M=5)
>>> 2- take the average of the imputed cells and replace the missing with
>>> that average.
>>> 3- use the dataset with the usual "ivreg"/2SLS commands
>>>
>>> Thanks in advance,
>>>
>>> James
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>> *   http://www.ats.ucla.edu/stat/stata/

