# RE: st: RE: Hotdeck problem

```
Joelle,

It worked for us. I cannot see the problem without the data, but you could try enlarging the by-cells. For example, instead of using age directly, you could create a new variable that represents ranges of age, say x3=1 if age<=18, x3=2 if age>18 & age<=25, and so on.

Good luck, Rodrigo.

Rodrigo,

Thank you for your response. I tried your suggested technique, and came up with the same results. The issue must lie within my variables...

Joelle

> Joelle,
>
> It seems that you have missing values in the by-variables. Consider the
> following command: hotdeck y, store by(x1 x2) keep(id). You will have
> trouble if x1 or x2 has missing values. We "solved" the problem using -9.
> Suppose x1 = {1, 2, 3} and x2 = {1, 2, 3, 4, 5}. We set x1=-9
> if x1==. and x2=-9 if x2==., ran hotdeck that way (with the new x1
> and x2), and then set to missing the cases where x1 or x2 ==-9. We did
> that with a simple loop.
>
> Rodrigo.
>
> For my thesis, I am using the hotdeck program to impute values for
> missing cases in my income variable. Currently, I am trying to hotdeck
> my income variable (176 missing) using 3 variables (age=27 missing;
> education=13 missing; gender=0 missing; although with 9 overlapping
> missing values the combination of these three variables only has 31
> missing cases total). Yet when Stata creates my new, hotdecked income
> variable, there are an additional 19 missing cases that I can't
> account for (missing=50). Does anyone know why this might be? Another
> strange thing is that, when I try to rename my hotdecked income
> measure before merging it with my full dataset, all 50 missing cases
> remain missing after merging; when I do not rename my hotdecked income
> measure before merging, only 42 missing cases remain missing after
> merging. I have pasted my Stata output below. Any help would be greatly appreciated!
>
> Joelle Anderson
> Graduate Student, Sociology
> University of Wisconsin-Milwaukee
> [email protected]
>
> //First hotdeck imputation, renaming the income variable BEFORE merging with full dataset
>
> . hotdeck incomeR using IncomeHD, store by(education sex ageR) keep(resp incomeR)
> DELETING all matrices....
>
> Table of the Missing data patterns
>  * signifies missing and - is not missing
>
> Varlist order: incomeR
>
>     pattern |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           * |        176       11.72       11.72
>           - |      1,326       88.28      100.00
> ------------+-----------------------------------
>       Total |      1,502      100.00
> WARNING: When the <command> option is not selected then no analysis is
> performed on the imputed datasets
>
>
> . use "C:\data\IncomeHD1.dta", clear
>
> . tab incomeR
>
>   RECODE of |
>      income |
>    (income. |
>  last year, |
>  that is in |
>  2004, what |
>    was your |
> total famil |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           1 |        103        7.09        7.09
>           2 |        164       11.29       18.39
>           3 |        222       15.29       33.68
>           4 |        148       10.19       43.87
>           5 |        162       11.16       55.03
>           6 |        265       18.25       73.28
>           7 |        178       12.26       85.54
>           8 |        124        8.54       94.08
>           9 |         86        5.92      100.00
> ------------+-----------------------------------
>       Total |      1,452      100.00
>
> . rename incomeR incomez
>
> . merge resp using "C:\Documents and Settings\anders35\My Documents\Thesis_3_29.dta", unique sort
>
> . tab incomez
>
>   RECODE of |
>      income |
>    (income. |
>  last year, |
>  that is in |
>  2004, what |
>    was your |
> total famil |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           1 |        103        7.09        7.09
>           2 |        164       11.29       18.39
>           3 |        222       15.29       33.68
>           4 |        148       10.19       43.87
>           5 |        162       11.16       55.03
>           6 |        265       18.25       73.28
>           7 |        178       12.26       85.54
>           8 |        124        8.54       94.08
>           9 |         86        5.92      100.00
> ------------+-----------------------------------
>       Total |      1,452      100.00
>
> //Second hotdeck imputation, renaming the hotdecked income variable AFTER merging with full dataset
>
> . hotdeck incomeR using IncomeHotD, store by(education sex ageR) keep(resp incomeR)
> DELETING all matrices....
>
> Table of the Missing data patterns
>  * signifies missing and - is not missing
>
> Varlist order: incomeR
>
>     pattern |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           * |        176       11.72       11.72
>           - |      1,326       88.28      100.00
> ------------+-----------------------------------
>       Total |      1,502      100.00
> WARNING: When the <command> option is not selected then no analysis is
> performed on the imputed datasets
>
> . clear
>
> . use "C:\data\IncomeHotD1.dta", clear
>
> . tab incomeR
>
>   RECODE of |
>      income |
>    (income. |
>  last year, |
>  that is in |
>  2004, what |
>    was your |
> total famil |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           1 |         98        6.75        6.75
>           2 |        162       11.16       17.91
>           3 |        220       15.15       33.06
>           4 |        153       10.54       43.60
>           5 |        159       10.95       54.55
>           6 |        267       18.39       72.93
>           7 |        178       12.26       85.19
>           8 |        126        8.68       93.87
>           9 |         89        6.13      100.00
> ------------+-----------------------------------
>       Total |      1,452      100.00
>
> . merge resp using "C:\Documents and Settings\anders35\My Documents\Thesis_3_29.dta", unique sort
>
> . rename incomeR incomey
>
> . tab incomey
>
>   RECODE of |
>      income |
>    (income. |
>  last year, |
>  that is in |
>  2004, what |
>    was your |
> total famil |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           1 |         98        6.71        6.71
>           2 |        164       11.23       17.95
>           3 |        222       15.21       33.15
>           4 |        153       10.48       43.63
>           5 |        159       10.89       54.52
>           6 |        270       18.49       73.01
>           7 |        178       12.19       85.21
>           8 |        126        8.63       93.84
>           9 |         90        6.16      100.00
> ------------+-----------------------------------
>       Total |      1,460      100.00
>
```
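Rodrigo's workaround for missing values in the by-variables (recode them to -9, run hotdeck, then set them back to missing) might be sketched roughly as follows. This is only an illustration: `y`, `x1`, `x2` and `id` are the placeholder names from his message, the by-variables are assumed to be numeric, and -9 is assumed not to be a legitimate value of either one. The hotdeck call is copied from the thread rather than checked against the command's help file.

```stata
* Sketch only: y, x1, x2 and id are the placeholder names used in the thread.
* Assumes x1 and x2 are numeric and that -9 never occurs as a real value.

* Step 1: give missing by-variable values a temporary code
foreach v of varlist x1 x2 {
    replace `v' = -9 if missing(`v')
}

* Step 2: run the imputation, command as quoted in the thread
hotdeck y, store by(x1 x2) keep(id)

* Step 3: turn the temporary code back into missing
foreach v of varlist x1 x2 {
    replace `v' = . if `v' == -9
}
```

If the imputed values are written to a separate file (as with `using` in the original post), step 3 would presumably have to be applied to that file as well, provided the by-variables are kept in it.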

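The age-range variable suggested in the most recent reply could be built along these lines. The first two bands follow the message; the third band and its cutpoint are made up for illustration, since the message only says "and so on". The sketch assumes `age` is numeric.

```stata
* Sketch only: x3 is the name used in the thread; band 3 is a hypothetical example.
generate byte x3 = .
replace x3 = 1 if age <= 18
replace x3 = 2 if age > 18 & age <= 25
* Stata treats missing as larger than any number, so exclude it explicitly
replace x3 = 3 if age > 25 & !missing(age)

* Check the banding, including any observations left missing
tabulate x3, missing
```

Without the `!missing(age)` condition, observations with missing age would silently fall into the top band, which would hide the very missing values under discussion.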

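One possible explanation for the extra missing cases, not confirmed anywhere in the thread, is that some by-cells contain no observed income value at all, so there is no donor to draw from. A quick check, using the variable names from the original post, could look like the sketch below. `n_donors` is a hypothetical helper variable.

```stata
* Sketch only: incomeR, education, sex and ageR are the names from the original post.
* n_donors is a hypothetical helper: the number of non-missing incomeR values per by-cell.
bysort education sex ageR: egen n_donors = count(incomeR)

* Missing incomes that sit in a cell with no donor at all
count if missing(incomeR) & n_donors == 0

* List the offending cells to see which combinations are too sparse
tabulate education sex if missing(incomeR) & n_donors == 0, missing
```

If the count is well above zero, coarsening the by-variables (for example with age bands) should reduce it.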