Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Jordan H <jihool3670@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: st: surprising (at least to me) behavior when using -predict- after -mim-
Date: Thu, 24 Feb 2011 15:43:55 -0500

Dear all,

Suppose I have a dataset with missing data and, as such, I have used multiple imputation to create 2 imputed datasets. As per the documentation for -mim-, my dataset is set up as follows:

    _mj  _mi    y      x
      0    1  1.1  100.1
      0    2  9.2      .
      0    3  3.7      .
      1    1  1.1  100.1
      1    2  9.2  105.3
      1    3  3.7  110.9
      2    1  1.1  100.1
      2    2  9.2  104.8
      2    3  3.7  111.3

I have run -mim: logit y x- to fit a model and combine the estimates across imputations. When I subsequently run -mim: predict y_predicted-, Stata returns predicted probabilities even for those observations that have missing data, i.e.

    _mj  _mi  y      x  y_predicted
      0    1  0  100.1         0.39
      0    2  0      .         0.25
      0    3  1      .         0.56
      1    1  0  100.1         0.39
      1    2  0  105.3         0.71
      1    3  1  110.9         0.87
      2    1  0  100.1         0.39
      2    2  0  104.8         0.73
      2    3  1  111.3         0.86

How is it producing predicted probabilities when x is missing? Running -predict- after fitting a logit model produces . for observations with missing data, due to case-wise deletion. I've gone through the documentation... what am I missing?

Related question: I also have a test dataset that is formatted the same way as above, i.e. with both the original, non-imputed data and the imputed data in the same file. Once I have fit a model on the training dataset, I would like to assess its predictive performance by predicting for the observations in the test dataset and looking at things like sensitivity, specificity, etc. My question here is: which predicted probabilities should I be concerned with? Should I be concerned with how well the model predicts the un-imputed data? Or should I just worry about how well it predicts the data that have been imputed?

Thanks so much for the consideration!

Jordan
