Statalist



AW: st: RE: AW: how to reconstruct a minimum acceptable income from a set of binary variables?


From   "Martin Weiss" <martin.weiss1@gmx.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   AW: st: RE: AW: how to reconstruct a minimum acceptable income from a set of binary variables?
Date   Mon, 2 Mar 2009 18:00:30 +0100


On your method, I would say this is the best way for a beginner to think
about such a problem, because it shows step by step how values get replaced
when a condition holds. You can omit the "==1" part, btw.
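
A minimal sketch of what dropping "==1" does, on made-up data (the variable
flag and its three observations are hypothetical). Stata treats any nonzero
value as true, and missing counts as nonzero, so the short form matches the
explicit one only when the indicator has no missing values:

```stata
* Hypothetical 0/1 indicator with one missing value
clear
input id flag
1 1
2 0
3 .
end

* Explicit comparison: only observations where flag equals 1
count if flag == 1
local n_explicit = r(N)

* Short form: true for every nonzero value, and missing counts as nonzero
count if flag
local n_short = r(N)

display "explicit: `n_explicit'   short form: `n_short'"
```

With these three observations the explicit form counts 1 and the short form
counts 2, so the two differ exactly on the missing value.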

On the last question, M. Buis gave a talk in Chicago which touched upon
this subject. See http://www.stata.com/meeting/snasug08/buis_MLBsimulate.zip
and execute -view mlbsnasug08.smcl- to start the presentation.



HTH
Martin


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Ekaterina
Hertog
Sent: Monday, 2 March 2009 17:50
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: AW: how to reconstruct a minimum acceptable income from
a set of binary variables?

Dear all,
Thank you very much for all the comments.

The real structure of my data is as I stated originally, rather than as 
Martin described it:
e.g.
input id mindesinc_500_999 mindesinc_1000_1499 mindesinc_1500_1999 mindesinc_2000_2499
101 1 1 1 1
102 0 1 1 1
103 0 0 1 1
104 0 0 1 1
105 0 1 1 1
end

Basically, once a person finds a certain income acceptable, he or she also
finds every higher income acceptable and enters 1s rather than 0s.
At the moment I am using one of the earlier suggestions I received, with a
small modification:
gen mindesinc = 0
replace mindesinc = 7000 if desired_income_above_7000 == 1
replace mindesinc = 6000 if desired_income_6000_6999 == 1
replace mindesinc = 5000 if desired_income_5000_5999 == 1
replace mindesinc = 4500 if desired_income_4500_4999 == 1
replace mindesinc = 4000 if desired_income_4000_4499 == 1
replace mindesinc = 3500 if desired_income_3500_3999 == 1
replace mindesinc = 3000 if desired_income_3000_3499 == 1
replace mindesinc = 2500 if desired_income_2500_2999 == 1
replace mindesinc = 2000 if desired_income_2000_2499 == 1
replace mindesinc = 1500 if desired_income_1500_1999 == 1
replace mindesinc = 1000 if desired_income_1000_1499 == 1
replace mindesinc = 0 if desired_income_0_999 == 1 | no_desired_income == 1

I know it is not very elegant, but I thought this would pick up the 
lowest acceptable income.
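
The same top-down logic can be sketched as a loop. This is only an
illustration on toy data with three bands, using the mindesinc_* names from
the example above; it assumes every band variable is named
mindesinc_<lower>_<upper>, which does not hold for all of the real variable
names (e.g. desired_income_above_7000), so those would still need their own
lines.

```stata
* Toy data in the cumulative format: 1s fill to the right of the first 1
clear
input id mindesinc_500_999 mindesinc_1000_1499 mindesinc_1500_1999
101 1 1 1
102 0 1 1
103 0 0 1
end

* Start from missing, then work from the highest band down, so that the
* lowest band flagged 1 is the last one to overwrite the value
generate mindesinc = .
foreach v in mindesinc_1500_1999 mindesinc_1000_1499 mindesinc_500_999 {
    * the lower bound is the second "_"-separated piece of the name
    local lower = real(word(subinstr("`v'", "_", " ", .), 2))
    replace mindesinc = `lower' if `v' == 1
}
list
```

Starting from missing rather than 0 is a choice made for this sketch: it
keeps people with no band flagged distinguishable from people whose minimum
really is 0.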

I was wondering if anyone has any thoughts on a related question
which is not specifically about Stata.
Apart from the variables on minimum desired income, my dataset (it comes
from a marriage agency) contains a host of variables on desired
height, weight, marital status etc.
In particular, I am now thinking about the variables on minimum desired
height. They differ from the income variables in that for many
people they describe both a desired minimum and a desired maximum. So the
data look something like this:

input id  mindesheigh_below_150  mindesheigh_150_154   
mindesheight_155_159    mindesheight_160_164  
mindesheight_165_169                 ...    mindesheight_above_180
101 1 1 1 1 0
102 0 1 1 1 0
103 0 0 1 0 0
104 0 0 1 1 0
105 0 1 1 1 1
 
I am restructuring these variables into two: minimum desired height and 
maximum desired height. I am not sure how to treat the minimum desired 
height below 150 in the variable for minimum desired height and maximum 
desired height of above 180 in the variable for maximum desired height 
since I cannot really input metric values into them. There are few 
people who have such preferences so if I cannot find a good way of 
dealing with those I could consider simply dropping the observations in 
question, but I was wondering if anyone has a good idea or knows of a 
paper which came up with a good solution to such an issue?
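
The mechanical part of this restructuring can be sketched as follows. The
short variable names (h_*), the five bands and the variables min_open and
max_open are hypothetical and only mirror the shape of the example rows
above. The sketch does not settle how the open-ended bands should be
treated: it records them in separate 0/1 flags instead of assigning them a
metric value, which is one possible convention and not an established
solution.

```stata
* Toy data: one open-ended band at each end, three bounded bands between
clear
input id h_below_150 h_150_154 h_155_159 h_160_164 h_above_164
101 1 1 1 1 0
102 0 1 1 1 0
103 0 0 1 0 0
104 0 0 1 1 0
105 0 1 1 1 1
end

generate minheight = .
generate maxheight = .

* Bounded bands, lowest first; limits are read from the band label (cm)
foreach band in 150_154 155_159 160_164 {
    local lo = real(substr("`band'", 1, 3))
    local hi = real(substr("`band'", 5, 3))
    * keep the first (lowest) flagged band as the minimum
    replace minheight = `lo' if h_`band' == 1 & missing(minheight)
    * keep overwriting, so the last (highest) flagged band is the maximum
    replace maxheight = `hi' if h_`band' == 1
}

* Open-ended bands are flagged rather than given a made-up metric value
generate byte min_open = h_below_150 == 1
generate byte max_open = h_above_164 == 1
list
```

In this sketch minheight and maxheight describe only the bounded bands, so
an observation with min_open or max_open equal to 1 can later be dropped,
censored or modelled separately without redoing the recode.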
I would be very grateful for advice,
sincerely yours,
Ekaterina


Jeph Herrin wrote:
> Well, on inspection, I see that her data have multiple
> tags per record, so that 1s are filled to the right after
> the first (left to right) 1; I was misled by
> Martin's faux dataset.  Her stated logic would then
> require:
>
>   gen min=mininc1*500+(mininc2-mininc1)*1000+(mininc3-mininc2)*1500
>
>
> Pending clarification from Ekaterina about the _real_ structure
> of her data.
>
> Jeph
>
>
>
>
> Nick Cox wrote:
>> Good!
>> I didn't spell out that I feared that there are yet other variables in
>> what might be Ekaterina's _real_ problem, lurking behind her stated
>> problem, making a more general approach attractive too.
>> Nick
>> n.j.cox@durham.ac.uk
>>
>> Jeph Herrin wrote:
>>
>> Briefer yet:
>>
>>    gen min=mininc1*500+mininc2*1000+mininc3*1500
>>
>> which also traps the missings Nick cautions about.
>>
>> Nick Cox wrote:
>>> A variation on the same idea: 
>>> gen min = 500
>>> foreach v in 1000 1500 2000 {
>>>     replace min = `v' if mindesinc_`v' == 1
>>> }
>>>
>>> To be careful, check
>>> egen row = rowtotal(mindesinc*)
>>> assert row == 1
>>> Nick
>>> n.j.cox@durham.ac.uk
>>>
>>> Martin Weiss wrote:
>>>
>>> *reconstruct Ekaterina's data
>>> clear*
>>> input id mindesinc_500_999 mindesinc_1000_1499 mindesinc_1500_1999 mindesinc_2000_2499
>>> 101 1 0 0 0
>>> 102 0 1 0 0
>>> 103 0 0 1 0
>>> 104 0 0 1 0
>>> 105 0 1 0 0
>>> end
>>> *construct the minimum desired income
>>> g mindesinc=500 if mindesinc_500_999
>>> replace mindesinc=1000 if mindesinc_1000_1499
>>> replace mindesinc=1500 if mindesinc_1500_1999
>>> l
>>>
>>> Ekaterina Hertog
>>> I am dealing with a dataset from a private company, so my data often
>>> come in a rather strange format, and I have now come up against the
>>> following problem:
>>>
>>> I have a set of individuals who answered questions about desired income.
>>> It looks as follows:
>>>
>>> Individ nmb | Min desired income 500 - 999 | 1000 - 1499 | 1500 - 2000 |
>>> 101         |              0               |      0      |      1      |
>>> 102         |              0               |      1      |      1      |
>>> 103         |              0               |      0      |      1      |
>>> 104         |              1               |      1      |      1      |
>>> 105         |              0               |      1      |      1      |
>>>
>>> Is there a way to automatically recode these binary minimum desired
>>> income variables into a numerical variable which would state the minimum
>>> acceptable figure for each individual?
>>> That is, some routine which would check "Min desired income 500 - 999"
>>> and, if it equals 1, would input 500 for the individual in question
>>> into a newly constructed variable "Minimum acceptable income" and move
>>> on to the next person; if it equals 0, it would look at the value of the
>>> "1000 - 1499" variable and, if that equals 1, would input 1000 for that
>>> person and move on to the next person; and if that is 0, it would look
>>> at the "1500 - 2000" variable?
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>>


-- 
Ekaterina Hertog (née Korobtseva)
Career Development Fellow
Department of Sociology and Nissan Institute of Japanese Studies
University of Oxford

27 Winchester Road
Oxford
OX2 6NA
United Kingdom




