st: Hotdeck imputation
I need to do a relatively simple imputation, but am having trouble following
the examples given.
Here is the situation:
Dataset ~ 10,000 obs (non-weighted, 1 obs/subject)
Variable to be imputed:
EKG_abnormal: binary (yes/no), missing at random in < 5% of observations.
Potential predictors with which to impute:
At least five, some binary or categorical (e.g. chestpain yes/no, first_cat (1-5), etc.),
some continuous but easily made categorical (e.g. age ==> age_cat)
Primary outcome being studied: Death yes/no
(1) Should I use the outcome variable (death) as one of the imputation
variables? Should I use many imputation variables since I can (large dataset)?
(2) Most important: Can somebody give an example for the correct way to
issue the commands?
If I do the following:
. hotdeck ekg_abnormal using imp, by(agecat first_cat) store
Then I end up with 5 files, imp1 imp2 imp3 imp4 imp5
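To make clear what I think is happening, here is a rough sketch in Python (not Stata) of a plain cell-based hot deck: each missing value is filled with a randomly drawn observed value from the same by() cell, and the whole thing is repeated five times. This is only my mental model, not necessarily the algorithm the hotdeck command actually uses; the function name hotdeck_impute and the toy rows are made up for illustration.

```python
import random
from collections import defaultdict

def hotdeck_impute(rows, target, by, seed=None):
    """Return a copy of rows in which missing `target` values (None) are
    filled with a randomly drawn observed value from the same `by` cell."""
    rng = random.Random(seed)
    # Pool the observed (donor) values for each combination of `by` variables.
    donors = defaultdict(list)
    for row in rows:
        if row[target] is not None:
            donors[tuple(row[v] for v in by)].append(row[target])
    completed = []
    for row in rows:
        new_row = dict(row)  # leave the original data untouched
        if new_row[target] is None:
            pool = donors.get(tuple(row[v] for v in by))
            if pool:  # a cell without any donor stays missing
                new_row[target] = rng.choice(pool)
        completed.append(new_row)
    return completed

# Toy data (hypothetical), using the variable names from the post.
rows = [
    {"age_cat": 1, "first_cat": 2, "ekg_abnormal": 1},
    {"age_cat": 1, "first_cat": 2, "ekg_abnormal": 0},
    {"age_cat": 1, "first_cat": 2, "ekg_abnormal": None},
    {"age_cat": 3, "first_cat": 5, "ekg_abnormal": 1},
    {"age_cat": 3, "first_cat": 5, "ekg_abnormal": None},
]

# Five completed copies, analogous to the files imp1 ... imp5.
imputed_sets = [
    hotdeck_impute(rows, "ekg_abnormal", ["age_cat", "first_cat"], seed=m)
    for m in range(1, 6)
]
```

Each element of imputed_sets is one completed dataset; the more variables go into by(), the smaller the cells get and the more likely it is that a cell has no donor at all.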
Eventually I want to end up with imputed values for ekg_abnormal that I can
use in the main logistic regression of interest. I am not sure where the
options infile(), command(logit) fit into things.
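My assumption is that the logit has to be fitted on each of the 5 files and the results then combined. The usual way to do that for multiply imputed data is Rubin's rules, sketched below in Python (not Stata). Whether the hotdeck options do this combining for me is exactly what I am unsure about; the function name pool_rubin and the numbers in the example call are made up purely to show the call.

```python
import math

def pool_rubin(estimates, variances):
    """Combine one coefficient across m imputed datasets with Rubin's rules.

    estimates: the coefficient from each imputed dataset
    variances: its squared standard error from each imputed dataset
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    q_bar = sum(estimates) / m                    # pooled point estimate
    within = sum(variances) / m                   # average within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between        # total variance
    return q_bar, math.sqrt(total)

# Made-up log-odds coefficients and standard errors from 5 imputed files.
coefs = [0.50, 0.55, 0.45, 0.52, 0.48]
ses = [0.10, 0.11, 0.10, 0.10, 0.11]
pooled_coef, pooled_se = pool_rubin(coefs, [s ** 2 for s in ses])
```

The pooled standard error is never smaller than the average within-file standard error, because the between-file spread reflects the uncertainty added by imputing.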
Any help would be greatly appreciated.