Statalist



RE: st: imputing continuous values when respondents select categories, e.g., income category


From   Richard Williams <Richard.A.Williams.5@ND.edu>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   RE: st: imputing continuous values when respondents select categories, e.g., income category
Date   Sat, 25 Apr 2009 00:51:57 -0500

At 11:20 PM 4/24/2009, Roy Wada wrote:

> ystar(a,b) will still give you censored predictions, which
> may not be a good idea as Richard indicated.
>
> Anyone knows if it's okay to use non-censored predictions from
> -intreg- as a part of 2SLS and bootstrap the standard error,
> assuming we have identifying instruments in the first stage?
>
> Roy

These possibilities are starting to make my head hurt. :) To back up, I've never heard of anyone computing the predicted values from intreg and then using them as an independent variable in subsequent analyses. That may just reflect my ignorance, but it seems like, at a minimum, your standard errors would be too optimistic, because the second stage would treat the predictions as observed data rather than as estimates.

For that matter, there are concerns about using intreg even for dependent variables: if the assumptions of the method are not met (e.g. normality), the estimates may be wrong. And, as the manual points out, for something like income you may want to use the logged values of the interval endpoints; the manual has an example. So you have to be careful that your use of intreg is legit in the first place.
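To make the logged-endpoints idea concrete, here is a minimal sketch in Python (not Stata, and not intreg itself). The function name and the income brackets are hypothetical. It mirrors the intreg convention that a missing lower bound marks a left-censored interval and a missing upper bound marks a right-censored one, and it assumes a lower bound of 0 should be treated as open-ended because log(0) is undefined.

```python
import math

def log_endpoints(lower, upper):
    """Return (log lower, log upper); None marks an open-ended side.

    Assumption: a lower bound of 0 (or less) is treated as open below,
    since its log is undefined.
    """
    if upper is not None and upper <= 0:
        raise ValueError("upper endpoint must be positive to take logs")
    log_lower = math.log(lower) if lower is not None and lower > 0 else None
    log_upper = math.log(upper) if upper is not None else None
    return log_lower, log_upper

# Hypothetical income brackets in dollars, not taken from any real survey.
brackets = [(0, 10000), (10000, 25000), (25000, 50000), (50000, None)]
logged = [log_endpoints(lo, hi) for lo, hi in brackets]

for (lo, hi), (llo, lhi) in zip(brackets, logged):
    print(lo, hi, llo, lhi)
```

The two logged columns would then play the role of the two dependent variables in an interval regression, so the normality assumption applies to log income rather than to income itself.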

Remember, too, that unlike fill-in-the-blank techniques for missing data, with intreg you aren't just imputing some values, you are imputing all of them. And, if you are computing this y-hat from x1, x2 and x3, why not just use x1, x2 and x3 in your other models and leave out the y-hat? Remember, the y-hat is an exact linear function of x1, x2 and x3 because it is computed from them, so it adds no information beyond what they already carry.
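The point about y-hat being computed from x1, x2 and x3 can be checked with a small numeric sketch. This is Python with made-up data and made-up coefficients, not output from intreg: it builds a linear prediction from three regressors and shows that adding it as a column does not raise the rank of the design matrix.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank by Gaussian elimination; exact because entries are Fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current pivot row with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Hypothetical data: six observations on x1, x2, x3.
x = [(1, 2, 0), (2, 1, 1), (3, 5, 0), (4, 3, 1), (5, 8, 1), (6, 4, 0)]
# Hypothetical coefficients standing in for a fitted model.
b0, b1, b2, b3 = 10, 2, -1, 5

design = [[1, x1, x2, x3] for x1, x2, x3 in x]
yhat = [b0 + b1 * x1 + b2 * x2 + b3 * x3 for x1, x2, x3 in x]
with_yhat = [row + [yh] for row, yh in zip(design, yhat)]

print(matrix_rank(design), matrix_rank(with_yhat))
```

Both ranks are the same, so a model that includes x1, x2, x3 and y-hat together cannot separate their effects; software would typically drop one of the columns.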

I'm just improvising here, but this doesn't seem like the way intreg should be used. If its assumptions are met, it can be a very nice alternative to other ordinal methods. But trying to use the estimated values from it as independent variables seems problematic. It is like the problems you have with single imputation of missing data, only worse, since every value is being imputed.

I keep wishing Scott Long or somebody like that would write more about intreg, so if there are any intreg experts out there, feel free to chime in!


-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  Richard.A.Williams.5@ND.Edu
WWW:    http://www.nd.edu/~rwilliam
