Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

Re: st: Grouping income variables- RECODE COMMAND

From   Maarten Buis
Subject   Re: st: Grouping income variables- RECODE COMMAND
Date   Tue, 4 Feb 2014 13:25:31 +0100

On Tue, Feb 4, 2014 at 12:29 PM, Antonio Rodriguez Andres wrote:

> First of all, I recode the household income variable using mid-points. The problem is defining a midpoint for the open-ended top category. For that purpose, I follow Hout (2004).
> *Create income midpoints
> recode hinctnt (1=900) (2=2700) (3=4800) (4=9000) (5=15000) (6=21000) (7=27000) (8= 33000) (9=48000) (10=75000) (11=105000) (12= 175200) , gen(hincome)
> replace hincome=. if hinctnt==77 | hinctnt==88 |  hinctnt==99  // I recode hinctnt= 77 & 88 & 99 (Don’t Know,  Refusal, No answer) as missing values
> gen lhincome=log(hincome)
> I also need to include in my regression a dummy variable for the missing values corresponding to income. I type in Stata.
> gen missinc=0
> replace missinc=1 if missing(hincome)
> When estimating the following model, the dummy variable for missing values for income is dropped but it has to be in my model. Is there anything wrong with the Stata code?

Two comments:

First, don't do this; it is not a good way of dealing with missing
values. See e.g.

Second, mechanically, what is going on is that most estimation commands
exclude every observation that has a missing value on any of the
variables included in the model. If you exclude all observations for
which lhincome is missing, then missinc will be a constant containing
only 0 in the remaining sample, and will thus be dropped from the model.
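
A minimal sketch of that mechanism, using made-up toy data: the outcome
y and all the values below are assumptions for illustration only, not
the original poster's data or model. Only hincome, lhincome and missinc
follow the names in the question.

```stata
* Toy data (hypothetical): two of the six observations lack income
clear
input y hincome
10  900
12 2700
15 4800
11    .
14    .
20 9000
end

* log() of a missing value is missing, so lhincome is missing
* exactly where hincome is missing
gen lhincome = log(hincome)
gen missinc  = missing(hincome)

* regress uses complete cases only: the two observations with
* missing lhincome are excluded from the estimation sample
regress y lhincome missinc

* Within the estimation sample missinc is 0 everywhere, so it is
* collinear with the constant and Stata omits it
tabulate missinc if e(sample)
```

In this sketch the regression uses only the four complete
observations, and the tabulate shows that missinc takes the single
value 0 among them, which is why it cannot be estimated alongside
the constant.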

Hope this helps,

Maarten L. Buis
Reichpietschufer 50
10785 Berlin
