[Sorry for reposting -- the first line of the original post got eaten by
the listmonster]

Summary: When I run -logit- under -mim- on a set of imputed datasets
created using -ice-, I get a parameter estimate of 0 and a standard
error of 0 for a continuous independent variable. However, fitting
the model to each imputed dataset individually (without -mim-) produces
non-zero estimates of similar magnitude. This suggests that either
-mim- is telling me something I don't know how to interpret, something
is wrong with -mim-, or something nearly invisible is wrong with the
data.

Longer version: I am looking at the relationship between adoption of a
particular kind of software system by physicians and a set of
independent variables. One of my independent variables is years of
experience, EXPER. EXPER is continuous and approximately normally
distributed. Unfortunately, 25% of my cases are missing on EXPER, so I
decided to try multiple imputation using the Galati, Carlin, and Royston
-ice- and -mim- commands available from SSC. After -ice-, the
distribution of EXPER in the imputed datasets looks fine (that is,
similar in shape, mean, and variance to the original), and its
relationship to the dependent variable HASEMR looks the same. If I use
-logit- (without -mim-) to look at the relationship between the two
variables in the original dataset (_mj==0) and in the first imputed
dataset (_mj==1), I get nearly identical results. (I've added
PCTPOVERTY, a continuous variable with no missing data, to show below
that my problem is just with EXPER.)

. logit hasemr exper pctPoverty if _mj==0
Iteration 0: log likelihood = -421.1634
Iteration 1: log likelihood = -398.42368
Iteration 2: log likelihood = -397.85135
Iteration 3: log likelihood = -397.84996

Logistic regression                               Number of obs   =        699
                                                  LR chi2(2)      =      46.63
                                                  Prob > chi2     =     0.0000
Log likelihood = -397.84996                       Pseudo R2       =     0.0554

------------------------------------------------------------------------------
      hasemr |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       exper |  -.0505082   .0091315    -5.53   0.000    -.0684057   -.0326108
  pctPoverty |  -.0441709   .0135119    -3.27   0.001    -.0706537   -.0176881
       _cons |   1.021042   .3025942     3.37   0.001     .4279685    1.614116
------------------------------------------------------------------------------

. logit hasemr exper pctPoverty if _mj==1
Iteration 0: log likelihood = -617.85901
Iteration 1: log likelihood = -588.0887
Iteration 2: log likelihood = -587.54423
Iteration 3: log likelihood = -587.5435

Logistic regression                               Number of obs   =       1001
                                                  LR chi2(2)      =      60.63
                                                  Prob > chi2     =     0.0000
Log likelihood = -587.5435                        Pseudo R2       =     0.0491

------------------------------------------------------------------------------
      hasemr |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       exper |  -.0492512   .0074785    -6.59   0.000    -.0639089   -.0345936
  pctPoverty |  -.0324184   .0106465    -3.04   0.002    -.0532852   -.0115517
       _cons |   .9076523   .2400196     3.78   0.000     .4372224    1.378082
------------------------------------------------------------------------------

However, this is what happens when I try to estimate the same model with
-mim-:

. mim: logit hasemr exper pctPoverty

Multiple-imputation estimates (logit)                    Imputations =      10
Logistic regression                                      Minimum obs =    1001
                                                         Minimum dof =   103.2

------------------------------------------------------------------------------
      hasemr |     Coef. Std. Err.      t   P>|t|      [95% Conf. Int.]    FMI
-------------+----------------------------------------------------------------
       exper |        -0         0  -5.52   0.000         -0         -0  0.285
  pctPoverty |  -.036964   .011038  -3.35   0.001   -.058637   -.015291  0.059
       _cons |   .937575   .270402   3.47   0.001    .403621    1.47153  0.217
------------------------------------------------------------------------------

Obviously something is wrong. It can't be just that the uncertainty of
the estimate is high due to the high proportion of missing data, since
that should result in a large standard error. Note that the imputation
procedure does produce a small number of negative and therefore
nonsensical values for years of experience (14 total across 10 imputed
datasets), but this problem doesn't go away when I set those to 0. Also
note that the dependent variable HASEMR has about 8% missing in the
original dataset.

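One possible diagnostic, offered as a sketch rather than a definitive check: the -mim- table reports t = -5.52 for EXPER, which would not be possible if both the estimate and its standard error were exactly 0, so the pooled values may be non-zero and only displayed as 0. Combining the per-imputation -logit- results by hand with Rubin's rules, and printing them with a wide display format, would show which is the case. The code assumes the stacked -ice- data are in memory and that _mj runs from 1 to 10; the scalar names (sumb, qbar, and so on) are illustrative choices, not part of -mim-.

```stata
* Pool the EXPER coefficient across imputations by hand (Rubin's rules).
* Assumes: stacked imputed data in memory, imputations indexed _mj = 1..10.
local M = 10
scalar sumb  = 0    // running sum of coefficients
scalar sumb2 = 0    // running sum of squared coefficients
scalar sumv  = 0    // running sum of squared standard errors

forvalues j = 1/`M' {
    quietly logit hasemr exper pctPoverty if _mj == `j'
    display "m = `j'   b = " %12.0g _b[exper] "   se = " %12.0g _se[exper]
    scalar sumb  = sumb  + _b[exper]
    scalar sumb2 = sumb2 + _b[exper]^2
    scalar sumv  = sumv  + _se[exper]^2
}

scalar qbar = sumb / `M'                            // pooled point estimate
scalar W    = sumv / `M'                            // within-imputation variance
scalar B    = (sumb2 - `M' * qbar^2) / (`M' - 1)    // between-imputation variance
scalar T    = W + (1 + 1/`M') * B                   // total variance

display "pooled b  = " %12.0g qbar
display "pooled se = " %12.0g sqrt(T)
```

If the hand-pooled values come out clearly non-zero, the zeros in the -mim- table would point to how the results are displayed rather than to the data or the estimates themselves.
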
Any idea what's wrong, or suggestions for diagnostics? Thanks.
--
Michael I. Lichter, Ph.D.
Research Assistant Professor & NRSA Fellow
UB Department of Family Medicine / Primary Care Research Institute
UB Clinical Center, 462 Grider Street, Buffalo, NY 14215
Office: CC 125 / Phone: 716-898-4751 / E-Mail: mlichter@buffalo.edu