# AW: AW: st: RE: AW: how to reconstruct a minimum acceptable income from a set of binary variables?

From: "Martin Weiss"
Subject: AW: AW: st: RE: AW: how to reconstruct a minimum acceptable income from a set of binary variables?
Date: Mon, 2 Mar 2009 18:20:22 +0100

```<>

Well, you can check it out yourself, or look it up in
http://www.stata-journal.com/article.html?article=pr0029, section 3. In a
nutshell, -if- without a logical comparison is equivalent to "different
from zero". In the case of your dummies, that means "1".

*************
sysuse auto, clear

*2 equivalent statements
summ mpg if foreign
summ mpg if foreign==1

*2 equivalent statements
summ mpg if !foreign
summ mpg if foreign==0
*************

HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Ekaterina
Hertog
Sent: Monday, 2 March 2009 18:07
To: statalist@hsphsun2.harvard.edu
Subject: Re: AW: st: RE: AW: how to reconstruct a minimum acceptable income
from a set of binary variables?

Dear Martin,
Thank you again for the advice!
Sorry, I am a bit confused: where can I omit "==1"? The only place where I
see "==1" is in these lines:

replace mindesinc=7000 if  desired_income_above_7000==1

Would it work if I write:
replace mindesinc=7000 if  desired_income_above_7000

Or did you mean I should omit the "==1" bit when writing to the list?
Sorry for a silly question,
sincerely yours,
Ekaterina

Martin Weiss wrote:
> <>
>
> On your method, I would say this is the best way for a beginner to think
> about such a problem and see how values logically get replaced based on
> some condition. You can omit the "==1" part, btw.
>
> On the last question, M. Buis gave a speech in Chicago which touched upon
> this subject. See
> http://www.stata.com/meeting/snasug08/buis_MLBsimulate.zip
> and execute -view mlbsnasug08.smcl- to start the presentation...
>
>
>
> HTH
> Martin
>
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Ekaterina
> Hertog
> Sent: Monday, 2 March 2009 17:50
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: RE: AW: how to reconstruct a minimum acceptable income
> from a set of binary variables?
>
> Dear all,
> Thank you very much for all the comments.
>
> The real structure of my data is as I stated originally, rather than as
> Martin described it:
> e.g.
> input id mindesinc_500_999 mindesinc_1000_1499  mindesinc_1500_1999
> 101 1 1 1 1
> 102 0 1 1 1
> 103 0 0 1 1
> 104 0 0 1 1
> 105 0 1 1 1
>
> Basically, once a person finds a certain income acceptable, he or she
> finds every higher income acceptable too and enters 1s rather than 0s.
> At the moment I am using a piece of advice I got earlier, slightly
> modified:
> gen mindesinc=0
> replace mindesinc=7000 if desired_income_above_7000==1
> replace mindesinc=6000 if desired_income_6000_6999==1
> replace mindesinc=5000 if desired_income_5000_5999==1
> replace mindesinc=4500 if desired_income_4500_4999==1
> replace mindesinc=4000 if desired_income_4000_4499==1
> replace mindesinc=3500 if desired_income_3500_3999==1
> replace mindesinc=3000 if desired_income_3000_3499==1
> replace mindesinc=2500 if desired_income_2500_2999==1
> replace mindesinc=2000 if desired_income_2000_2499==1
> replace mindesinc=1500 if desired_income_1500_1999==1
> replace mindesinc=1000 if desired_income_1000_1499==1
> replace mindesinc=0 if desired_income_0_999==1 | no_desired_income==1
>
> I know it is not very elegant, but I thought this would pick up the
> lowest acceptable income.
>
> I was wondering if anyone would have any thoughts on a related question
> which is not specifically on Stata.
> Apart from the variables on minimum desired income, my dataset (it is a
> dataset from a marriage agency) contains a host of variables on desired
> height, weight, marital status etc.
> In particular I am now thinking about the variables on minimum desired
> height. They differ from the income variables in that for many
> people they describe both a desired minimum and a maximum. So the data
> look something like this:
>
> input id mindesheight_below_150 mindesheight_150_154
> mindesheight_155_159 mindesheight_160_164
> mindesheight_165_169 ... mindesheight_above_180
> 101 1 1 1 1 0
> 102 0 1 1 1 0
> 103 0 0 1 0 0
> 104 0 0 1 1 0
> 105 0 1 1 1 1
>
> I am restructuring these variables into two: minimum desired height and
> maximum desired height. I am not sure how to treat the minimum desired
> height below 150 in the variable for minimum desired height and maximum
> desired height of above 180 in the variable for maximum desired height
> since I cannot really input metric values into them. There are few
> people who have such preferences so if I cannot find a good way of
> dealing with those I could consider simply dropping the observations in
> question, but I was wondering if anyone has a good idea or knows of a
> paper which came up with a good solution to such an issue?
> I would be very grateful for advice,
> sincerely yours,
> Ekaterina
>
>
> Jeph Herrin wrote:
>
>> Well, on inspection, I see that her data have multiple
>> tags per record, so that 1s are filled to the right after
>> the first (left to right) 1; I was misled by
>> Martin's faux dataset.  Her stated logic would then
>> require:
>>
>>   gen min=mininc1*500+(mininc2-mininc1)*1000+(mininc3-mininc2)*1500
>>
>>
>> Pending clarification from Ekaterina about the _real_ structure
>> of her data..
>>
>> Jeph
>>
>>
>>
>>
>> Nick Cox wrote:
>>
>>> Good!
>>> I didn't spell out that I feared that there are yet other variables in
>>> what might be Ekaterina's _real_ problem, lurking behind her stated
>>> problem, making a more general approach attractive too.
>>> Nick
>>> n.j.cox@durham.ac.uk
>>>
>>> Jeph Herrin
>>>
>>> Briefer yet:
>>>
>>>    gen min=mininc1*500+mininc2*1000+mininc3*1500
>>>
>>> which also traps the missings Nick cautions about.
>>>
>>> Nick Cox wrote:
>>>
>>>> A variation on the same idea:
>>>> gen min = 500
>>>> foreach v in 1000 1500 2000 {
>>>>     replace min = `v' if mindesinc_`v' == 1
>>>> }
>>>>
>>>> To be careful, check
>>>> egen row = rowtotal(mindesinc*)
>>>> assert row == 1
>>>> Nick
>>>> n.j.cox@durham.ac.uk
>>>>
>>>> Martin Weiss
>>>>
>>>> *reconstruct Ekaterina`s data
>>>> clear*
>>>> input id mindesinc_500_999 mindesinc_1000_1499  mindesinc_1500_1999
>>>> 101 1 0 0 0
>>>> 102 0 1 0 0
>>>> 103 0 0 1 0
>>>> 104 0 0 1 0
>>>> 105 0 1 0 0
>>>> end
>>>> *construct the minimum desired income
>>>> g mindesinc=500 if mindesinc_500_999
>>>> replace mindesinc=1000 if mindesinc_1000_1499
>>>> replace mindesinc=1500 if mindesinc_1500_1999
>>>> l
>>>>
>>>> Ekaterina Hertog
>>>>
>>>> I am dealing with a dataset from a private company, so my data often
>>>> come in a rather strange format, and I have now come up against the
>>>> following problem:
>>>>
>>>> I have a set of individuals who answered questions about desired income.
>>>> It looks as follows:
>>>>
>>>> Individ nmb | Min desired income 500 - 999 | 1000 - 1499 | 1500 - 2000 |
>>>> 101         | 0                            | 0           | 1           |
>>>> 102         | 0                            | 1           | 1           |
>>>> 103         | 0                            | 0           | 1           |
>>>> 104         | 1                            | 1           | 1           |
>>>> 105         | 0                            | 1           | 1           |
>>>>
>>>> Is there a way to automatically recode these binary minimum desired
>>>> income variables into a numerical variable which would state the
>>>> minimum acceptable figure for each individual?
>>>> That is, some routine which would check "Min desired income 500 -
>>>> 999" and, if it equals 1, would input 500 for the individual in
>>>> question into a newly constructed variable "Minimum acceptable income"
>>>> and move on to the next person; if it equals 0, it would look at the
>>>> value of the "1000 - 1499" variable and, if that equals 1, would input
>>>> 1000 for that person and move on to the next person; and if that is 0,
>>>> it would look at the "1500 - 2000" variable?
>>>>
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/statalist/faq
>>> *   http://www.ats.ucla.edu/stat/stata/
>>>
>>>
>>
>
>
>

--
Ekaterina Hertog (née Korobtseva)
Career Development Fellow
Department of Sociology and Nissan Institute of Japanese Studies
University of Oxford

Oxford
OX2 6NA
United Kingdom

```
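The approach discussed in the thread (replace from the highest band down, so the lowest acceptable band is the value that survives) can be sketched as a loop. This is only a sketch under assumptions: the toy data and the variable names `inc_500_999`, `inc_1000_1499` and `inc_1500_1999` are hypothetical stand-ins for the real `desired_income_*` variables, and the data are assumed to be in the cumulative layout Ekaterina describes, with one added row containing a missing value.

```stata
* Hypothetical toy data in the cumulative layout described in the thread:
* once a band is acceptable, every higher band is flagged 1 as well.
clear
input id inc_500_999 inc_1000_1499 inc_1500_1999
101 1 1 1
102 0 1 1
103 0 0 1
104 0 0 0
105 . 1 1
end

* Walk from the highest band down to the lowest, so the last replacement
* that fires for an observation is its lowest acceptable band.
gen mindesinc = .
local bands 1500_1999 1000_1499 500_999
foreach b of local bands {
    * Lower bound of the band = the digits before the underscore.
    local lower = substr("`b'", 1, strpos("`b'", "_") - 1)
    replace mindesinc = `lower' if inc_`b' == 1
}
list

* -if varname- is true for any nonzero value, and Stata treats missing
* as nonzero, so the first count includes id 105 and the second does not.
count if inc_500_999
count if inc_500_999 == 1
```

Starting from missing rather than 0 keeps respondents who flagged no band (id 104 here) distinguishable from those with a genuine value. The two `count` commands show why keeping `== 1` can matter: omitting it is equivalent only when the dummy has no missing values.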