
Re: AW: st: RE: AW: how to reconstruct a minimum acceptable income from a set of binary variables?

From   Ekaterina Hertog
Subject   Re: AW: st: RE: AW: how to reconstruct a minimum acceptable income from a set of binary variables?
Date   Mon, 02 Mar 2009 17:07:11 +0000

Dear Martin,
Thank you again for the advice!
Sorry, I am a bit confused: where can I omit "==1"? The only place where I see "==1" is in lines like this one:

replace mindesinc=7000 if  desired_income_above_7000==1

Would it work if I write:
replace mindesinc=7000 if  desired_income_above_7000

Or did you mean I should omit the "==1" bit when writing to the list?
Sorry for a silly question,
sincerely yours,

Martin Weiss wrote:
On your method, I would say this is the best way for a beginner to think
about such a problem and see how logically values get replaced based on some
condition. You can omit the "==1" part, btw.
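A minimal sketch of what omitting "==1" changes, using a hypothetical three-row toy dataset (not the real data). Stata treats any nonzero value as true in an -if- qualifier, and missing values count as nonzero, so the two forms agree only when the indicator has no missings:

```stata
* Hypothetical toy data: one 0/1 indicator with a missing value
clear
input id desired_income_above_7000
1 1
2 0
3 .
end

* Explicit comparison: true only where the indicator equals 1
gen with_eq = 7000 if desired_income_above_7000 == 1

* Bare variable: true wherever the indicator is nonzero, including missing
gen without_eq = 7000 if desired_income_above_7000

list
```

In this sketch id 3 gets 7000 in without_eq but stays missing in with_eq, so the shorter form is safe only if the indicators are known to be complete.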

On the last question, M. Buis gave a speech in Chicago which touched upon
this subject. See
and execute -view mlbsnasug08.smcl- to start the presentation...


-----Original Message-----
[] On behalf of Ekaterina
Sent: Monday, 2 March 2009 17:50
Subject: Re: st: RE: AW: how to reconstruct a minimum acceptable income from
a set of binary variables?

Dear all,
Thank you very much for all the comments.

The real structure of my data is as I stated originally, rather than as Martin described it:
input id mindesinc_500_999 mindesinc_1000_1499  mindesinc_1500_1999
101 1 1 1 1
102 0 1 1 1
103 0 0 1 1
104 0 0 1 1
105 0 1 1 1

Basically, once a person finds a certain income acceptable, he or she finds every higher income acceptable too and puts 1s rather than 0s. At the moment I am using one of the earlier pieces of advice I got, with a slight modification:
gen mindesinc=0
replace mindesinc=7000 if desired_income_above_7000==1
replace mindesinc=6000 if desired_income_6000_6999==1
replace mindesinc=5000 if desired_income_5000_5999==1
replace mindesinc=4500 if desired_income_4500_4999==1
replace mindesinc=4000 if desired_income_4000_4499==1
replace mindesinc=3500 if desired_income_3500_3999==1
replace mindesinc=3000 if desired_income_3000_3499==1
replace mindesinc=2500 if desired_income_2500_2999==1
replace mindesinc=2000 if desired_income_2000_2499==1
replace mindesinc=1500 if desired_income_1500_1999==1
replace mindesinc=1000 if desired_income_1000_1499==1
replace mindesinc=0 if desired_income_0_999==1 | no_desired_income==1

I know it is not very elegant, but I thought this would pick up the lowest acceptable income.
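The same top-down logic can be sketched as a loop. This is only an illustration on hypothetical toy data with three bands; it assumes the variables follow the desired_income_LOWER_UPPER naming pattern shown above, and it reads the lower bound out of the band name with substr() and strpos():

```stata
* Hypothetical toy data: 1s are filled to the right of the first acceptable band
clear
input id desired_income_1000_1499 desired_income_1500_1999 desired_income_2000_2499
101 1 1 1
102 0 1 1
103 0 0 1
end

gen mindesinc = .

* Walk from the highest band down, so the lowest ticked band is written last
foreach band in 2000_2499 1500_1999 1000_1499 {
    * lower bound = the part of the band name before the underscore
    local lower = substr("`band'", 1, strpos("`band'", "_") - 1)
    replace mindesinc = `lower' if desired_income_`band' == 1
}

list id mindesinc
```

Open-ended variables such as desired_income_above_7000 do not fit the pattern and would still need their own -replace- lines before the loop.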

I was wondering if anyone would have any thoughts on a related question which is not specifically about Stata. Apart from the variables on minimum desired income, my dataset (it is a dataset from a marriage agency) contains a host of variables on desired height, weight, marital status, etc. In particular, I am now thinking about the variables on minimum desired height. They differ from the income variables in that for many people they describe both a desired minimum and a maximum. So the data look something like this:

input id mindesheight_below_150 mindesheight_150_154 mindesheight_155_159 mindesheight_160_164 mindesheight_165_169 ... mindesheight_above_180
101 1 1 1 1 0
102 0 1 1 1 0
103 0 0 1 0 0
104 0 0 1 1 0
105 0 1 1 1 1
I am restructuring these variables into two: minimum desired height and maximum desired height. I am not sure how to treat the minimum desired height below 150 in the variable for minimum desired height and maximum desired height of above 180 in the variable for maximum desired height since I cannot really input metric values into them. There are few people who have such preferences so if I cannot find a good way of dealing with those I could consider simply dropping the observations in question, but I was wondering if anyone has a good idea or knows of a paper which came up with a good solution to such an issue?
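One possible way to handle the open-ended bands, sketched here on hypothetical toy data with shortened, made-up variable names (h_below_150, h_150_154, ..., h_above_165), is to leave the unknown bound missing and record a flag, rather than inventing a metric value or dropping the observation. Whether that suits the later analysis is a judgment call; this is an illustration, not a recommendation from the thread:

```stata
* Hypothetical toy data and variable names, for illustration only
clear
input id h_below_150 h_150_154 h_155_159 h_160_164 h_above_165
101 1 1 1 0 0
102 0 1 1 1 0
103 0 0 1 1 1
end

gen minheight = .
gen maxheight = .

* Minimum: loop from the tallest closed band down, so the shortest ticked band wins
foreach lo in 160 155 150 {
    local hi = `lo' + 4
    replace minheight = `lo' if h_`lo'_`hi' == 1
}

* Maximum: loop from the shortest closed band up, so the tallest ticked band wins
foreach lo in 150 155 160 {
    local hi = `lo' + 4
    replace maxheight = `hi' if h_`lo'_`hi' == 1
}

* Open-ended bands: flag them and leave the unknown bound missing
gen byte min_open = h_below_150 == 1
gen byte max_open = h_above_165 == 1
replace minheight = . if min_open
replace maxheight = . if max_open

list
```

The flags min_open and max_open keep these people in the data and make it possible to treat them separately later.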
I would be very grateful for advice,
sincerely yours,

Jeph Herrin wrote:
Well, on inspection, I see that her data have multiple
tags per record, so that 1s are filled to the right after
the first (left to right) 1; I was misled by
Martin's faux dataset.  Her stated logic would then

  gen min=mininc1*500+(mininc2-mininc1)*1000+(mininc3-mininc2)*1500

Pending clarification from Ekaterina about the _real_ structure
of her data..


Nick Cox wrote:
I didn't spell out that I feared that there are yet other variables in
what might be Ekaterina's _real_ problem, lurking behind her stated
problem, making a more general approach attractive too.

Jeph Herrin wrote:

Briefer yet:

   gen min=mininc1*500+mininc2*1000+mininc3*1500

which also traps the missings Nick cautions about.
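A small check of the two formulas quoted in this thread, on hypothetical toy data in which 1s are filled to the right (the structure Ekaterina describes). The variable names mininc1 to mininc3 come from the formulas; the data rows are made up:

```stata
* Hypothetical toy data: 1s filled to the right of the first acceptable band
clear
input id mininc1 mininc2 mininc3
101 1 1 1
102 0 1 1
103 0 0 1
end

* Differenced version: only the first 1 in each row contributes
gen min_diff = mininc1*500 + (mininc2-mininc1)*1000 + (mininc3-mininc2)*1500

* Plain weighted sum: correct only when exactly one indicator is 1 per row
gen min_sum = mininc1*500 + mininc2*1000 + mininc3*1500

list
```

On these rows min_diff gives 500, 1000 and 1500, while min_sum gives 3000, 2500 and 1500, so the shorter formula fits only data with a single 1 per person.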

Nick Cox wrote:
A variation on the same idea:

gen min = 500
foreach v in 1000 1500 2000 {
    replace min = `v' if mindesinc_`v' == 1
}

To be careful, check
egen row = rowtotal(mindesinc*)
assert row == 1
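A sketch of how that check behaves on hypothetical toy data where several bands are ticked per person. The variable names are made up to match the mindesinc_ prefix, and -capture- is used so that a failed assertion does not stop the do-file:

```stata
* Hypothetical toy data: more than one band ticked for some ids
clear
input id mindesinc_500 mindesinc_1000 mindesinc_1500
101 1 1 1
102 0 1 1
103 0 0 1
end

* Count how many bands each person ticked
egen row = rowtotal(mindesinc_*)

* -capture- stores the return code in _rc instead of halting on failure
capture assert row == 1
if _rc != 0 {
    display "some ids do not have exactly one band ticked"
    list id row if row != 1
}
```

Here the assertion fails for ids 101 and 102, which signals that a one-band-per-person approach would not be appropriate for data shaped like this.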

Martin Weiss wrote:

*reconstruct Ekaterina's data
input id mindesinc_500_999 mindesinc_1000_1499  mindesinc_1500_1999
101 1 0 0 0
102 0 1 0 0
103 0 0 1 0
104 0 0 1 0
105 0 1 0 0
*construct the minimum desired income
g mindesinc=500 if mindesinc_500_999
replace mindesinc=1000 if mindesinc_1000_1499
replace mindesinc=1500 if mindesinc_1500_1999

Ekaterina Hertog wrote:
I am dealing with a dataset from a private company and so my data
comes in rather strange format and I now came against the following
I have a set of individuals who answered questions about desired
It looks as follows:

Individ nmb | Min desired income 500 - 999 | 1000 - 1499 | 1500 -
101 | 0 | 0 | 1 |
102 | 0 | 1 | 1 |
103 | 0 | 0 | 1 |
104 | 1 | 1 | 1 |
105 | 0 | 1 | 1 |

Is there a way to automatically recode these binary minimum desired income variables into a numerical variable which would state the
acceptable figure for each individual?
That is some routine which would check "Min desired income 500 - 999" and if it equals 1 then would input 500 for the individual in question into a newly constructed variable "Minimum acceptable income" and move on to the next person and if it equals 0 would look at the value of "1000 - 1499" variable and if it equals 1 would input 1000 for that person and move on to the next person and if it is 0 would look at
- 2000" variable?
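The check-then-fall-through routine described above can be sketched in a single line with nested cond() calls. The variable names inc_500, inc_1000 and inc_1500 are hypothetical stand-ins for the three columns in the table, and only those three bands are covered:

```stata
* Hypothetical variable names; rows copied from the table above
clear
input id inc_500 inc_1000 inc_1500
101 0 0 1
102 0 1 1
103 0 0 1
104 1 1 1
105 0 1 1
end

* Take the first band that equals 1, reading left to right; missing if none is ticked
gen minacceptable = cond(inc_500 == 1, 500, cond(inc_1000 == 1, 1000, cond(inc_1500 == 1, 1500, .)))

list id minacceptable
```

With many bands the nesting becomes hard to read, and a sequence of -replace- statements or a loop is probably easier to maintain.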
*   For searches and help try:


Ekaterina Hertog (née Korobtseva)
Career Development Fellow
Department of Sociology and Nissan Institute of Japanese Studies
University of Oxford

27 Winchester Road
United Kingdom

