Re: st: Different types of missing data and MI

From   Austin Nichols
Subject   Re: st: Different types of missing data and MI
Date   Mon, 13 Jun 2011 20:50:09 -0400

Clyde Schechter:
If it's true that your covariate falling below the detection limit is
not predictable from other covariates or outcomes, then it is
apparently orthogonal and can be omitted with no effect; if that's
true, then various imputation schemes should also produce essentially
the same estimates for other coefs.  You might estimate a -tobit- of
the 1/3 missing covariate on other covariates (and possibly the
outcome) and predict based on xb and a random draw from the error
distribution; a detection limit is one of the few instances in which
-tobit- works really well in simulations.  You can also omit the
covariate, or replace the censored values with the detection limit or
zero and include a dummy for missingness (all known to be problematic
in some cases, but not yours).  There are more options than these four, but if
these four produce similar results, you have a very good footnote for
whatever table makes the final cut: the results are invariant to this
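
A minimal sketch of the "predict based on xb and a random draw from the error distribution" step, written in plain Python rather than Stata so it can be run anywhere. It assumes the -tobit- has already been fitted and that each observation carries its linear prediction xb; the values of LIMIT and SIGMA, the function names, and the toy rows are all hypothetical. For a censored case the draw is taken from a normal distribution centered at xb and truncated above at the detection limit, using the inverse-CDF method. Nothing here keeps draws above zero, so for a strictly positive assay one might instead model the log of the covariate.

```python
import random
from statistics import NormalDist

LIMIT = 0.5   # hypothetical detection limit of the assay
SIGMA = 1.0   # hypothetical error standard deviation from the fitted tobit


def draw_below_limit(xb, sigma, limit, rng):
    """Draw one value from N(xb, sigma^2) truncated to lie at or below `limit`."""
    dist = NormalDist(mu=xb, sigma=sigma)
    p_below = dist.cdf(limit)            # probability mass below the limit
    u = rng.random() * p_below           # uniform draw on [0, p_below)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # inv_cdf needs 0 < u < 1
    return min(dist.inv_cdf(u), limit)   # guard against rounding past the limit


def impute_censored(rows, sigma, limit, rng):
    """rows: list of (xb, x) pairs, with x set to None when below the limit."""
    return [
        x if x is not None else draw_below_limit(xb, sigma, limit, rng)
        for xb, x in rows
    ]


rng = random.Random(2011)  # fixed seed so the sketch is reproducible
rows = [(0.9, 1.4), (0.2, None), (1.1, 0.7), (-0.3, None)]
completed = impute_censored(rows, SIGMA, LIMIT, rng)
```

Observed values pass through unchanged and only the censored cases receive a draw. Repeating the draw with different seeds and re-estimating the outcome model each time gives a rough sense of how much the other coefficients move; a single draw like this does not reflect uncertainty in the tobit estimates themselves.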

On Mon, Jun 13, 2011 at 7:20 PM, Clyde Schechter wrote:
> My problem is a third kind of missing data.  One of the covariates is the
> result of a lab assay that has a lower limit of detectability.  So these
> data are not missing in the full sense, rather they are left censored at
> the lower limit of detectability (or, more properly, interval censored
> between zero and the lower limit of detectability).  I don't know what to
> do with these.  -mi- doesn't seem applicable since these are certainly not
> missing at random.  And any way I can think of to try to impute values
> here strikes me as inherently invalid because it appears that the data
> simply do not contain any information whatsoever about the relationship
> between this variable and the outcome (or anything else) in the
> undetectable range.  And I don't know of any analytic methods that handle
> interval-censored independent variables.
> For now, because the lower limit of detectability is close to zero, and
> because analyses and graphical explorations excluding these cases suggest
> that this variable is not associated with the outcome anyhow, I've done an
> analysis where I simply recode these particular values as zero. But I
> can't escape the feeling that this is not really defensible.
> There are two alternatives I would prefer.  One is to simply omit these
> cases altogether--but there are a lot of them, about a third of the
> sample, and it would leave us rather underpowered.  The other is to just
> drop this variable (especially since it doesn't seem to be associated with
> the outcome anyway, at least outside the censoring range)--but the
> variable was actually identified in our study aims as one of the key
> predictors of interest.  (I guess we weren't very prescient!)
> Any advice would be appreciated.  Thanks in advance