The Stata listserver

RE: st: RE: the impute command

From   "Nick Cox"
Subject   RE: st: RE: the impute command
Date   Wed, 19 May 2004 17:40:56 +0100

. replace y = round(y) 

is an alternative to -recode- here. 
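
A minimal sketch of that round() route, on simulated data, might look like the following. The variable names x1, x2 and yhat, the simulated coefficients and the seed are assumptions for illustration only, and rnormal() and runiform() assume a reasonably recent Stata. As I understand it, the one difference from the -recode- rules quoted below is at exactly .5: round() should send that to 1, whereas the first matching -recode- rule would send it to 0.

```stata
* Sketch only: classify predicted probabilities with round() instead of -recode-
clear
set seed 12345
set obs 200
generate x1 = rnormal()
generate x2 = rnormal()
* assumed data-generating model, for illustration
generate y = runiform() < invlogit(0.5*x1 - 0.5*x2)
replace y = . in 1/20              // make some outcomes missing

quietly logit y x1 x2
predict yhat                        // default after -logit-: predicted probability
replace yhat = round(yhat)          // nearest integer, so 0 or 1
replace y = yhat if missing(y)      // fill in only the missing outcomes
```

After this runs, y should contain only 0s and 1s, with the previously missing cases filled in from the rounded predictions.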

On the broader issue of how to impute, 
every now and again a debate breaks out 
on Statalist on different ways of imputing. 
A search of the archives would find several 
very interesting posts. The executive 
summary appears to be

1. There are many ways of doing it. (The 
best way is to do the measurement again, 
difficult or impossible in many cases!)  

2. It is not at all clear that -impute- 
is overall a very good way of doing it. 

3. Various users have implemented some 
better ways to do it. 

4. Official Stata's support for good 
ways of doing it is not that great, right 
now. Information on what StataCorp will
add in the near future follows this colon: 

(no, sorry, it's missing; impute your own)


> -----Original Message-----
> From: On Behalf Of Richard Williams
> Sent: 19 May 2004 17:01
> Subject: Re: st: RE: the impute command
> >I think that the impute command works only for continuous variables.
> >If you want to base your imputation on the deterministic regression
> >technique, you would perhaps need to use a -probit- command.
> >
> >Shqiponja
>
> It will work with a dichotomy, but it may not be optimal.  An alternative
> would be something like
> quietly logit y x1 x2
> predict yhat
> recode yhat (0/.50 = 0)(.50/1 = 1)
> replace y = yhat if missing(y)
> This could get tedious, though, if you have a bunch of Xs, several of
> which have missing data. For example, if you said
> impute y x1 x2 x3 x4 x5 x6 x7
> it would run one regression for those who had complete data on all the
> xs; then it would run another regression for those that had missing data
> on one x; etc.  So, to manually do the equivalent of what impute is doing
> could be incredibly time consuming.
> This raises an interesting question:  I wonder how tough it would be to
> create a variation of the impute command using logit or probit instead of
> regress?
> Also, is any impute method for a dichotomy necessarily the right way to
> go?  For example, if you have relatively rare or very frequent events, all
> of the imputed values could wind up being 0 or 1.  From what I've read, the
> use of impute is debatable even for OLS regression, and my guess is that it
> would be even more questionable for a dichotomy.
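
The rare-events concern in the quoted message can be illustrated with a small simulation. This is only a sketch under stated assumptions: the variable names x1 and phat, the intercept of -3 (a baseline event rate of roughly 5%), the slope, and the seed are all made up for illustration, and rnormal() and runiform() assume a reasonably recent Stata.

```stata
* Sketch only: with a rare outcome, a .5 cutoff will tend to impute only zeros
clear
set seed 2004
set obs 1000
generate x1 = rnormal()
* assumed data-generating model: baseline probability of about 5%
generate y = runiform() < invlogit(-3 + 0.3*x1)
replace y = . in 1/100              // pretend 100 outcomes are missing

quietly logit y x1
predict phat                         // predicted probabilities
summarize phat                       // compare the largest value with .5
count if phat >= .5 & missing(y)     // missing cases that would be imputed as 1
```

If that count is zero, every missing y would be imputed as 0 under the cutoff rule, so the imputed cases would understate the event rate.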
