
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Subject: st: RE: RE: Imputing values for categorical data
Date: Thu, 15 Apr 2004 21:03:00 +0100

```
impute the missing reference here. In this case, it
happens to be a book I know about.
(In other cases, in other postings, just giving
author surnames and dates makes the reference search
difficult: list members please note.)
Statistical Analysis With Missing Data, Second Edition
Roderick J. A. Little, Donald B. Rubin
ISBN: 0-471-18386-5 Wiley
September 2002
Nick
[email protected]
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]On Behalf Of Dupont,
> William
> Sent: 15 April 2004 20:47
> To: [email protected]
> Subject: st: RE: Imputing values for categorical data
>
>
> Jennifer
>
> In my opinion, imputation makes the most sense when we wish to adjust
> for confounding variables. Suppose that I am primarily interested in
> the relationship between y and x, and I have complete data on
> these two
> variables from my data set. I feel, however, that I should adjust my
> analysis for a number of other confounding covariates and I know that
> missing values are scattered throughout these covariates. If I just
> regress y against x and these other covariates I get a complete case
> analysis: any record that is missing any value of these covariates is
> dropped from the analysis. This can lead to a substantial
> loss of power
> and has the potential to induce bias if having complete data
> is related
> to the response of interest. Suppose that one of my confounding
> variables is gender. If I have a number of records where y and x are
> known but gender is not, it does not seem sensible to throw out this
> information just because I would like to adjust my estimates
> for gender.
> If, however, I impute gender I can avoid losing these data.
> As long as
> gender is only in the model as a confounder, I don't see that it does
> much harm to have an imputed value of, say, .2 for some patient, which
> means that, based on her other covariates, she is 4 times more likely
> to be of one gender than the other.
>
> A tricky problem with imputation is that we often lack assurance that
> the missing values are missing at random. However, even in this
> situation, it is unclear that the complete case analysis is
> superior to
> an imputed analysis for the situation described above. Imputation
> becomes much more problematic when some variables of primary interest
> have missing values.
>
> The imputation gurus do not like the single conditional imputation
> provided by Stata (see for example Little and Rubin 2002). This is
> because this technique underestimates the standard error of the
> regression coefficient for covariates with imputed values and
> overestimates the degrees of freedom. Multiple imputation methods get
> around this problem and are fine as long as you are confident that the
> missing values are missing at random. If you are only using
> imputation
> for confounding variables I'm not convinced that it makes much
> difference how you do the imputation. However, multiple imputation is
> always theoretically preferable and can avoid hassles in the
> event that
> you come up against a referee who objects to all use of single
> conditional imputation.
```
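
The point in the quoted message about standard errors can be illustrated with a small sketch of Rubin's rules, the usual way of pooling results across multiply imputed data sets. This is only an illustration under stated assumptions: the coefficient estimates and squared standard errors below are made-up numbers, not results from any real analysis, and the function name `pool_rubin` is hypothetical. The "within-only" standard error ignores the variation between imputations, which is roughly the kind of understatement attributed above to single imputation.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors with Rubin's rules.

    Returns (pooled estimate, pooled SE, within-only SE).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations, with one variance per estimate")
    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total ** 0.5, within ** 0.5


# Hypothetical values: one regression coefficient estimated on m = 5 imputed data sets
coefs = [0.50, 0.54, 0.46, 0.52, 0.48]
ses_squared = [0.010, 0.011, 0.009, 0.010, 0.010]

pooled, pooled_se, naive_se = pool_rubin(coefs, ses_squared)
print(f"pooled coefficient = {pooled:.3f}")
print(f"pooled SE = {pooled_se:.3f}, within-only SE = {naive_se:.3f}")
```

The pooled standard error is never smaller than the within-only one, because the between-imputation term adds the uncertainty due to not knowing the missing values.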
