
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: Grouping income variables- RECODE COMMAND


From   "Antonio Rodriguez Andres" <Antonio.Andres@emu.edu.tr>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Grouping income variables- RECODE COMMAND
Date   Tue, 4 Feb 2014 14:48:54 +0200

Dear Maarten

Thank you very much for your feedback. What I did is the following

http://www3.nd.edu/~rwilliam/stats2/l12.pdf

*Create income midpoints

recode hinctnt (1=900) (2=2700) (3=4800) (4=9000) (5=15000) (6=21000) (7=27000) (8= 33000) (9=48000) (10=75000) (11=105000) (12= 175200) , gen(hincome)
replace hincome=. if hinctnt==77 | hinctnt==88 |  hinctnt==99
gen lhincome=log(hincome)
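As a side note, the missing codes can probably be folded into the recode itself, which avoids the separate replace step. The sketch below uses a small made-up dataset with only a few of the income categories; the variable names follow the code above, but the data are purely illustrative.

```stata
* Toy data (hypothetical): a few income categories plus the 77/88/99 codes
clear
input hinctnt
1
2
12
77
88
99
end

* One recode: midpoints for the categories, system missing for 77, 88 and 99
recode hinctnt (1=900) (2=2700) (12=175200) (77 88 99=.), gen(hincome)

* log() of a missing value is missing, so lhincome is missing in the same rows
gen lhincome = log(hincome)
list
```

The rule (77 88 99=.) maps all three codes to system missing in one pass, so hincome never holds the raw codes.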

**dummy indicator for missing income values

gen xhincome=hincome
replace xhincome= 29304.99 if missing(hincome)
gen md=0
replace md=1 if xhincome!=hincome

xtmixed dprt age age2 gender married separated divorced widowed eduyrs ichldhm md lhincome ihealth iuemp5yr iuemp12m rgdp06[pw=dweight] if md==0 || cntry: gender , mle


But I still got the same message: the md indicator variable is dropped. How can I estimate the model while controlling for missing values in income?

Antonio
-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Maarten Buis
Sent: Tuesday, February 04, 2014 2:26 PM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Grouping income variables- RECODE COMMAND

On Tue, Feb 4, 2014 at 12:29 PM, Antonio Rodriguez Andres wrote:

> First of all, I recode the household income variable using midpoints. The problem is defining a midpoint for the open-ended top category. For that purpose, I follow Hout (2004).
> *Create income midpoints
> recode hinctnt (1=900) (2=2700) (3=4800) (4=9000) (5=15000) (6=21000) (7=27000) (8=33000) (9=48000) (10=75000) (11=105000) (12=175200), gen(hincome)
> replace hincome=. if hinctnt==77 | hinctnt==88 | hinctnt==99  // I recode hinctnt = 77, 88 and 99 (Don’t Know, Refusal, No answer) as missing values
> gen lhincome=log(hincome)
>
> I also need to include in my regression a dummy variable for the missing values corresponding to income. I type in Stata:
> gen missinc=0
> replace missinc=1 if missing(hincome)
>
> When estimating the following model, the dummy variable for missing values for income is dropped, but it has to be in my model. Is there anything wrong with the Stata code?

Two comments:

First, don't do this; it is not a good way of dealing with missing values. See e.g.
<http://www.stata.com/statalist/archive/2007-12/msg00030.html>

Second, mechanically, what is going on is that most estimation commands exclude every observation that has a missing value on any of the variables included in the model. If you exclude all observations for which lhincome is missing, then missinc will be a constant containing only 0, and will thus be excluded.
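
The mechanics can be checked on simulated data. The sketch below is only an illustration: the variables y, x, income, lincome, xincome and lxincome and the data themselves are made up, and regress stands in for the mixed model. It first shows that the indicator is constant within the estimation sample, then that it survives once a filled-in income variable replaces the one with missing values. The second part shows why the variable is no longer dropped; it does not answer the first comment.

```stata
* Toy data (hypothetical): outcome y, covariate x, income with missing values
clear
set seed 12345
set obs 200
gen x = rnormal()
gen income = exp(rnormal(10, 1))
replace income = . if runiform() < 0.2   // set roughly 20% of incomes to missing
gen lincome = log(income)
gen missinc = missing(income)            // 1 if income is missing, 0 otherwise
gen y = 1 + 0.5*x + rnormal()

* lincome is missing wherever missinc==1, so those rows leave the sample
* and missinc is omitted as a constant
regress y x lincome missinc
tabulate missinc if e(sample)            // only the value 0 remains

* With a filled-in income variable no rows are lost, so missinc varies
summarize income, meanonly
gen xincome = cond(missing(income), r(mean), income)
gen lxincome = log(xincome)
regress y x lxincome missinc
```

Tabulating a variable with if e(sample) right after an estimation command is a quick way to see which values actually entered the model.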

Hope this helps,
Maarten

---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany

http://www.maartenbuis.nl
---------------------------------

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/





© Copyright 1996–2018 StataCorp LLC