Re: st: Imputing values for categorical data
Jennifer Wolfe Borum wrote:
> I am working with a data set composed of responses to survey questions
> which contains some categorical variables such as gender and ethnicity.
> The data has missing values and I have decided that it would be best to
> keep all observations due to a pattern in the missing values. I have
> decided to use the impute command in Stata to handle this as I've had some
> difficulty and am not familiar enough with the hotdeck and Amelia
> imputations. I've found that impute works fine for the continuous
> variables; however, for the categorical variables I am obtaining values
> that I am unsure how to interpret. For example, I will get an imputed
> value of .35621 for gender which is coded 1 or 0. Would anyone be able to
> help with the interpretation of the values I am obtaining for the
> categorical data?
I have never used -impute- before, but the values you appear to have
generated with it for your gender variable are what one might call
'pseudo-probabilities'. A value of .35621 would suggest that the
predicted probability of the observation being gender==1 is about 0.36,
i.e. below one half, so gender==0 is the more likely category.
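
As a rough illustration of why a 0/1 variable can come back with a
fractional value: -impute- is documented as filling in missing values by
regression, so the imputed value is a fitted value rather than a category.
The Python sketch below is not Stata's implementation. The data are made
up, and the single predictor x and the helper fit_ols are hypothetical
names used only for this example.

```python
# Toy data (made up): each row is (x, gender), gender coded 0/1.
# None marks an observation whose gender is missing.
rows = [
    (1.0, 0), (2.0, 0), (3.0, 0), (4.0, 1),
    (5.0, 0), (6.0, 1), (7.0, 1), (8.0, 1),
    (3.5, None),
]


def fit_ols(xs, ys):
    """Least-squares fit of y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Fit the regression on the complete cases only.
complete = [(x, g) for x, g in rows if g is not None]
intercept, slope = fit_ols([x for x, _ in complete],
                           [g for _, g in complete])

# Replace each missing gender with the fitted value from the regression.
imputed = [
    (x, g if g is not None else intercept + slope * x)
    for x, g in rows
]

# The filled-in value is a fraction between 0 and 1, not a category.
filled_value = imputed[-1][1]
print(filled_value)
```

The filled-in value here lies strictly between 0 and 1, which is the same
kind of number as the .35621 reported above: a fitted value from a linear
model, readable as a rough probability that gender==1.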
> Also, I would be interested in knowing which approach other Stata users
> prefer for imputing values as this is the first time I have encountered
> missing values and I am just beginning to research the various methods of
I'm not sure how imputing missing values for a variable like gender is
particularly useful. If you don't know the gender for an observation, I
personally think it best to leave it as missing rather than guess
(sometimes correctly, sometimes erroneously), since guessing risks
undermining the accuracy of your data.
CLIVE NICHOLAS |t: 0(44)191 222 5969
Politics |e: firstname.lastname@example.org
Newcastle University |http://www.ncl.ac.uk/geps