The Stata listserver

Re: st: Imputing values for categorical data

From   "Clive Nicholas"
Subject   Re: st: Imputing values for categorical data
Date   Fri, 9 Apr 2004 01:45:08 +0100 (BST)

Jennifer Wolfe Borum wrote:

> I am working with a data set composed of responses to survey questions
> which contains some categorical variables such as gender and ethnicity.
> The data has missing values and I have decided that it would be best to
> keep all observations due to a pattern in the missing values. I have
> decided to use the impute command in Stata to handle this as I've had some
> difficulty and am not familiar enough with the hotdeck and Amelia
> imputations. I've found that impute works fine for the continuous
> variables; however, for the categorical variables I am obtaining values
> that I am unsure how to interpret. For example, I will get an imputed
> value of .35621 for gender which is coded 1 or 0. Would anyone be able to
> help with the interpretation of the values I am obtaining for the
> categorical data?

I have never used -impute- before, but the values you appear to have
generated with it for your gender variable are what one might call
'pseudo-probabilities'. A value of .35621 would suggest that the
probability of the observation being gender==1 is roughly 36%, so the
observation is somewhat more likely to be gender==0 than gender==1.
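
To see how a 0/1 variable can come back with a fractional value, here is
a minimal sketch in Python (not Stata). It assumes the imputed value is
simply the fitted value from a linear regression estimated on the
complete cases, which is an assumption about -impute- and not something
confirmed in this thread. The toy data, the single predictor `age`, and
the helper `ols_fit` are all hypothetical.

```python
# Toy data: (age, gender) pairs with gender coded 0/1.
# None marks a missing gender. All values are made up for illustration.
rows = [(20, 0), (25, 0), (30, 1), (35, 0), (40, 1), (45, 1),
        (28, None), (50, None)]


def ols_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Fit the regression on the complete cases only.
complete = [(x, y) for x, y in rows if y is not None]
intercept, slope = ols_fit([x for x, _ in complete],
                           [y for _, y in complete])

# "Impute" each missing gender with its fitted value from the line.
imputed = [(x, intercept + slope * x) for x, y in rows if y is None]
for x, p in imputed:
    print(f"age={x}: imputed gender = {p:.2f}")
```

With these made-up numbers the two imputed values come out at about 0.32
and 1.20. The first reads naturally as a pseudo-probability; the second
shows that a linear fit is not constrained to lie between 0 and 1, which
is one reason such values need care when the variable is categorical.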

> Also, I would be interested in knowing which approach other Stata users
> prefer for imputing values as this is the first time I have encountered
> missing values and I am just beginning to research the various methods of
> imputation.

I'm not sure how imputing missing values for a variable like gender is
particularly useful. If you don't know the gender for an observation, I
personally think it best to leave it as missing rather than guess
(sometimes correctly, sometimes erroneously), since guessing risks
undermining the accuracy of your data.

CLIVE NICHOLAS        |t: 0(44)191 222 5969
Politics              |e:
Newcastle University  |