st: Hotdeck command


From   "Jennifer Wheeler" <jwheele1@tulane.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Hotdeck command
Date   Mon, 22 Mar 2004 12:51:40 -0600

Esteemed Statalist users:

I am a new Stata user and have been experimenting with the hotdeck command.
I am attempting to impute values for line non-response as well as item
non-response and have come across some difficulties in setting the seed so I
can reproduce my results.  The sample is stratified by reg and spec.

My first question goes as follows:  is there a way to use hotdeck to impute
for both item and line non-response?

I have started out by creating a variable ("impute") that identifies all
cases for which to impute a line of data in a series of four variables
(i.e., the cases that have missing values for all four variables). I then
run a hotdeck command to impute:

hotdeck type size number setting, store by (reg spec) keep (id_item) imp(1) seed (123456789)
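
As an aside on the "impute" flag described above, here is a minimal sketch of how such a flag might be built, assuming the four variables are numeric. The toy data and the helper names nmiss and item_nr are hypothetical and for illustration only; they are not taken from the actual survey.

```stata
* Toy data for illustration only; values are made up, not from the survey
clear
input id_item type size number setting
1 1 2 10 1
2 . 3 12 2
3 . . .  .
4 2 . .  1
end

* Count how many of the four variables are missing in each row
egen nmiss = rowmiss(type size number setting)

* impute = 1 for line non-response (all four missing), 0 otherwise
generate byte impute = (nmiss == 4)

* Hypothetical extra flag: item non-response (some, but not all, missing)
generate byte item_nr = inrange(nmiss, 1, 3)

list id_item nmiss impute item_nr
```

Keeping the two kinds of non-response in separate flags makes it easier to check which cases each hotdeck step is meant to touch.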


This procedure seems to be imputing new values for each variable if any one
variable is missing (so it is imputing a new line of data for a case
requiring only item imputation).  I want to be able to conserve the
information I have for those cases that are missing some (and not all) items
and only impute for the variables that are missing.  To reconcile this I
attempted a second step where I ran hotdeck again on the original variables
individually for only those cases that are not complete lines of missing
data.

hotdeck type if impute==0, store by (reg spec) keep (id_item) imp(1) seed (123456789)

hotdeck size if impute==0, store by (reg spec) keep (id_item) imp(1) seed (123456789)

etc.

I used the values generated from the first step for the line non-response
and the values from the second step for the item non-response.  Upon
examination of these variables, it appears that the procedure yields the same
imputed values each time the program is run.  However, when I calculate a
weighted estimate of the "number" variable for the population, I get
different results each time.  Is there a way to correct this?

Is there a more straightforward way to use hotdeck imputation for item and
line non-response?
Any suggestions would be greatly appreciated.

J Wheeler
