It seems that you have missing values in the by-variables. Consider the
following command: hotdeck y, store by(x1 x2) keep(id). You will run into
trouble if x1 or x2 contains missing values. We "solved" the problem by
using -9 as a placeholder code. Suppose x1 takes the values {1, 2, 3} and
x2 the values {1, 2, 3, 4, 5}. We set x1=-9 if x1==. and x2=-9 if x2==.,
ran the hotdeck with the recoded x1 and x2, and afterwards set back to
missing the cases where x1 or x2 ==-9. We did that with a simple loop.
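
A rough sketch of what such a loop could look like is given here. It is
only an illustration: x1, x2, y and id are the placeholder names from the
example command above, nmiss_by is a made-up helper variable, and the
sketch assumes that -9 is not a legitimate value of any by-variable. The
hotdeck call is copied as written above; add "using" and a file name if
your version of the program needs them with the store option.

```stata
* Optional check: how many cases with missing y also lack a by-variable?
* (nmiss_by is a hypothetical helper name)
egen byte nmiss_by = rowmiss(x1 x2)
tabulate nmiss_by if missing(y)

* Recode missing by-variables to -9 so they form their own by-groups.
foreach v of varlist x1 x2 {
    assert `v' != -9 if !missing(`v')   // stop if -9 is already in use
    replace `v' = -9 if missing(`v')
}

hotdeck y, store by(x1 x2) keep(id)

* Put the missing values back once the imputation has run.
* Note: extended missings (.a, .b, ...) come back as plain "." here.
foreach v of varlist x1 x2 {
    replace `v' = . if `v' == -9
}
```

With this recoding, a case whose by-variable is missing draws its donor
from other cases with the same missing pattern, so it may be worth
checking that each -9 group contains at least one observed value of y.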

Rodrigo.

For my thesis, I am using the hotdeck program to impute values for
missing cases in my income variable. Currently, I am trying to hotdeck
my income variable (176 missing) using 3 variables (age=27 missing;
education=13 missing; gender=0 missing; although with 9 overlapping
missing values the combination of these three variables only has 31
missing cases total). Yet when Stata creates my new, hotdecked income
variable, there are an additional 19 missing cases that I can't account
for (missing=50). Does anyone know why this might be? Another strange
thing is that, when I try to rename my hotdecked income measure before
merging it with my full dataset, all 50 missing cases remain missing
after merging; when I do not rename my hotdecked income measure before
merging, only 42 missing cases remain missing after merging. I have
pasted my Stata output below. Any help would be greatly appreciated!

Joelle Anderson
University of Wisconsin-Milwaukee
anders35@uwm.edu

//First hotdeck imputation, renaming the income variable BEFORE merging with full dataset

. hotdeck incomeR using IncomeHD, store by(education sex ageR) keep(resp incomeR)
DELETING all matrices....

Table of the Missing data patterns
* signifies missing and - is not missing

Varlist order: incomeR

    pattern |      Freq.     Percent        Cum.
------------+-----------------------------------
          * |        176       11.72       11.72
          - |      1,326       88.28      100.00
------------+-----------------------------------
      Total |      1,502      100.00
WARNING: When the <command> option is not selected then no analysis is
performed on the imputed datasets

. use "C:\data\IncomeHD1.dta", clear

. tab incomeR

  RECODE of |
     income |
   (income. |
 last year, |
 that is in |
 2004, what |
total famil |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |        103        7.09        7.09
          2 |        164       11.29       18.39
          3 |        222       15.29       33.68
          4 |        148       10.19       43.87
          5 |        162       11.16       55.03
          6 |        265       18.25       73.28
          7 |        178       12.26       85.54
          8 |        124        8.54       94.08
          9 |         86        5.92      100.00
------------+-----------------------------------
      Total |      1,452      100.00

. rename incomeR incomez

. merge resp using "C:\Documents and Settings\anders35\My Documents\Thesis_3_29.dta", unique sort

. tab incomez

  RECODE of |
     income |
   (income. |
 last year, |
 that is in |
 2004, what |
total famil |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |        103        7.09        7.09
          2 |        164       11.29       18.39
          3 |        222       15.29       33.68
          4 |        148       10.19       43.87
          5 |        162       11.16       55.03
          6 |        265       18.25       73.28
          7 |        178       12.26       85.54
          8 |        124        8.54       94.08
          9 |         86        5.92      100.00
------------+-----------------------------------
      Total |      1,452      100.00

//Second hotdeck imputation, renaming the hotdecked income variable AFTER merging with full dataset

. hotdeck incomeR using IncomeHotD, store by(education sex ageR) keep(resp incomeR)
DELETING all matrices....

Table of the Missing data patterns
* signifies missing and - is not missing

Varlist order: incomeR

    pattern |      Freq.     Percent        Cum.
------------+-----------------------------------
          * |        176       11.72       11.72
          - |      1,326       88.28      100.00
------------+-----------------------------------
      Total |      1,502      100.00
WARNING: When the <command> option is not selected then no analysis is
performed on the imputed datasets

. clear

. use "C:\data\IncomeHotD1.dta", clear

. tab incomeR

  RECODE of |
     income |
   (income. |
 last year, |
 that is in |
 2004, what |
total famil |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         98        6.75        6.75
          2 |        162       11.16       17.91
          3 |        220       15.15       33.06
          4 |        153       10.54       43.60
          5 |        159       10.95       54.55
          6 |        267       18.39       72.93
          7 |        178       12.26       85.19
          8 |        126        8.68       93.87
          9 |         89        6.13      100.00
------------+-----------------------------------
      Total |      1,452      100.00

. merge resp using "C:\Documents and Settings\anders35\My Documents\Thesis_3_29.dta", unique sort

. rename incomeR incomey

. tab incomey

  RECODE of |
     income |
   (income. |
 last year, |
 that is in |
 2004, what |
total famil |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         98        6.71        6.71
          2 |        164       11.23       17.95
          3 |        222       15.21       33.15
          4 |        153       10.48       43.63
          5 |        159       10.89       54.52
          6 |        270       18.49       73.01
          7 |        178       12.19       85.21
          8 |        126        8.63       93.84
          9 |         90        6.16      100.00
------------+-----------------------------------
      Total |      1,460      100.00

```