Re: st: Different types of missing data and MI

From	Austin Nichols <[email protected]>
To	[email protected]
Subject	Re: st: Different types of missing data and MI
Date	Mon, 13 Jun 2011 20:50:09 -0400

Clyde Schechter <[email protected]>:
If it's true that your covariate falling below the detection limit is
not predictable from other covariates or outcomes, then it is
apparently orthogonal to them and can be omitted with no effect on the
other coefficients; if that's true, then various imputation schemes
should also produce essentially the same estimates for those
coefficients.  You might estimate a -tobit- of the covariate that is
1/3 missing on the other covariates (and possibly the outcome) and
predict based on xb and a random draw from the error distribution; a
detection limit is one of the few instances in which -tobit- works
really well in simulations.  You can also omit the covariate; you can
also replace it with the detection limit, or zero, and include a dummy
for missing (all known to be problematic in some cases, but not
yours).  There are more options than these four, but if these four
produce similar results, you have a very good footnote for whatever
table makes the final cut: the results are invariant to this choice.
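
A rough sketch of the "xb plus a random draw" step, in Python rather
than Stata, may make the idea concrete.  It assumes the -tobit- has
already been fit elsewhere, so that a linear prediction xb and an error
standard deviation sigma are available for each censored case.  It also
assumes the draw is taken conditional on the value lying below the
detection limit, which is one reading of "a random draw from the error
distribution" and not the only possible one.  The function name and the
numbers are hypothetical.

```python
import random
from statistics import NormalDist


def draw_below_limit(xb, sigma, limit, rng):
    """Draw from N(xb, sigma^2) conditional on the value being <= limit.

    xb and sigma are assumed to come from a tobit fit done elsewhere.
    """
    std = NormalDist()
    # Share of the latent normal distribution that lies below the limit.
    p_below = std.cdf((limit - xb) / sigma)
    # Inverse-CDF sampling: a uniform on [0, p_below) maps to the lower tail.
    # The floor keeps inv_cdf away from 0, where it is undefined.
    u = max(p_below * rng.random(), 1e-300)
    value = xb + sigma * std.inv_cdf(u)
    # Guard against rounding (or the floor above) pushing past the limit.
    return min(value, limit)


# Hypothetical (xb, sigma) pairs for two cases below the detection limit.
rng = random.Random(2011)  # fixed seed so the imputation is reproducible
limit = 0.1
fitted = [(0.05, 0.3), (0.20, 0.3)]
imputed = [draw_below_limit(xb, s, limit, rng) for xb, s in fitted]
```

Repeating the draws with different seeds and re-estimating the outcome
model each time gives a sense of how much the other coefficients depend
on the imputed values, which is the comparison suggested above.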

On Mon, Jun 13, 2011 at 7:20 PM, Clyde Schechter
<[email protected]> wrote:
> My problem is a third kind of missing data.  One of the covariates is the
> result of a lab assay that has a lower limit of detectability.  So these
> data are not missing in the full sense, rather they are left censored at
> the lower limit of detectability (or, more properly, interval censored
> between zero and the lower limit of detectability).  I don't know what to
> do with these.  -mi- doesn't seem applicable since these are certainly not
> missing at random.  And any way I can think of to try to impute values
> here strikes me as inherently invalid because it appears that the data
> simply do not contain any information whatsoever about the relationship
> between this variable and the outcome (or anything else) in the
> undetectable range.  And I don't know of any analytic methods that handle
> interval-censored independent variables.
>
> For now, because the lower limit of detectability is close to zero, and
> because analyses and graphical explorations excluding these cases suggest
> that this variable is not associated with the outcome anyhow, I've done an
> analysis where I simply recode these particular values as zero. But I
> can't escape the feeling that this is not really defensible.
>
> There are two alternatives I would prefer.  One is to simply omit these
> cases altogether--but there are a lot of them, about a third of the
> sample, and it would leave us rather underpowered.  The other is to just
> drop this variable (especially since it doesn't seem to be associated with
> the outcome anyway, at least outside the censoring range)--but the
> variable was actually identified in our study aims as one of the key
> predictors of interest.  (I guess we weren't very prescient!)
>
> Any advice would be appreciated.  Thanks in advance.
