Re: st: trouble with -mi predict- in Stata 12

From   Omar Badawi
Subject   Re: st: trouble with -mi predict- in Stata 12
Date   Tue, 14 Aug 2012 17:27:19 -0400

Please ignore my original request. It looks like an oversight in my
coding caused some extra patients to be included in this calculation.
I now have it sorted out. Sorry for the false alarm!
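
For anyone reading the archive later, here is a minimal sketch of how extra observations can distort an observed-versus-predicted comparison. It is an illustration only: it uses Stata's shipped auto dataset and a plain -logit- rather than the imputed patient data and -mi estimate- from this thread, and the variable names insample and phat are hypothetical, chosen just for this example.

```stata
* Sketch only: shipped auto data, not the mi data discussed in this thread.
sysuse auto, clear

* Fit on a subset (cars with nonmissing rep78) to mimic a restricted sample.
logit foreign mpg weight if !missing(rep78)

* Save the estimation-sample marker now; later e-class commands replace e(sample).
generate byte insample = e(sample)

* predict fills in every row with nonmissing covariates, including rows
* that were not used to fit the model.
predict phat, pr

* Observed and predicted events, estimation sample only.
total phat foreign if insample

* Same totals over all rows: the cars excluded from the fit are now counted.
total phat foreign
```

For a logit model with a constant, the two totals in the first -total- call agree on the estimation sample by construction, so a gap that appears only in the second call points to rows that should not have been in the calculation.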



On Mon, Aug 13, 2012 at 10:38 AM, Omar Badawi wrote:
> Hi All,
> I have been trying to generate predicted probabilities from a logistic
> regression model using -mi estimate- in Stata 12 and was hoping to get
> some insight into why my results do not seem correct. Here is the
> background:
> I have a dataset with patients who have an initial observation and one
> for each follow-up period, up to 4 total observations. I performed a
> multiple imputation using -mi impute chained (regress)- to generate 5
> imputed datasets. I then converted to wide format using -mi convert
> wide- and performed some data preparation for a logistic regression
> model.
> My logistic regression model has the following format:
> . mi estimate, saving(miestimates, replace): logit y x1 x2 etc....
> I then applied the following commands as described in the Stata 12 user guide:
> . mi predict xb_mi using miestimates
> . qui mi xeq: generate phat = invlogit(xb_mi)
> When I try to examine the actual to predicted number of events across
> various categories, my results do not appear to make sense. For
> example, with the following command, where the variable 'category' has
> 5 different categories, my predictions are 2- to 3-fold higher than the
> actual number of events in every category. This seems true across
> different categorical variables.
> . mi estimate : total phat y, over(category)
> I also tried the following with the same results:
> . mi xeq 1: total phat y, over(category)
> I don't believe my model can possibly over-predict by 2- to 3-fold for
> everyone because I also tested calibration on each individual imputed
> dataset using -mi convert flongsep- and running a Hosmer-Lemeshow GOF
> test on each of the 5 datasets. When I did that, I got excellent
> calibration across the 10 deciles of risk. Also, when I generate a
> model using complete case analysis instead of using multiple
> imputation, the coefficients are similar and the calibration is
> excellent so I'm fairly confident the issue is not dramatic
> over-prediction of the model.
> If anybody has any suggestions on where I might be going wrong or how
> to troubleshoot, I would really appreciate it. Thanks in advance for
> the help!
> Sincerely,
> Omar Badawi