# Re[2]: st: creating a variable using -if-? programming needed?

```
Dear Nick,

For me, your solution with "[3 - _n]" is the most elegant and most
easily understood.

But how should I handle the situation where I have data on all the household
members and, let's say, I want to keep them all? I have two variables
identifying a couple within each household: "spouse1" and "spouse2". So, if
there is a household composed of 5 members, and persons 1 and 3
are married, then spouse1=1 and spouse2=3.

I would do something like

#delimit;
bysort  site  censusd family : gen partner_income = personal[spouse1]
if person==spouse2;
bysort  site  censusd family :   replace partner_income =
personal[spouse2] if person==spouse1;

It works fine if both partners are present in the sample, but if only
one is, partner_income is sometimes filled with that person's own
income and sometimes with a missing value.

What could be wrong with the code?

Eka

Thursday, May 15, 2008, 4:24:34 PM, you wrote:

> Using the same assumption of a Noah's ark situation in which everyone is
> paired off, with no dependents, in-laws, etc., there are other minor
> variations on Scott's theme.

> bysort site family: gen partner_income = cond(_n == 1, personal[2], personal[1])

> bysort site family: gen partner_income = personal[3 - _n]

> I am fond of the second. If it looks puzzling, just go through the two
> cases. If _n is 1, then 3 - _n is 2, and vice versa.

> The key underlying principle, if it is not familiar, is that under -by:-
> _n is interpreted within groups, not within the entire dataset. There is
> a leisurely tutorial at

> <http://www.stata-journal.com/sjpdf.html?articlenum=pr0004>

> N.B. this is in the public domain.

> Nick
> n.j.cox@durham.ac.uk

> Scott Merryman

> What defines a partner?  Someone within the same two-person family?

> Perhaps something like this:

> clear
> input site family person personal_labor_income
>  1       1            1        2000
>  1       1            2        2300
>  1       2            1        200
>  1       2            3        3000
>  2       10           4        3400
>  2       10           5        3500
> end

> bysort site family : gen partner_income = cond(_n == 1, personal[_n+1], personal[_n-1])

> Ekaterina Selezneva

>> I have a dataset with some information on a sample of married
>>  couples. To identify a single person, one needs to know the
>>  "site" number, the "family" number within the site, and then the
>>  "person" number within the family. As this is a subsample of a
>>  bigger dataset, not all sites/families/persons are present in
>>  it. Let's say, something like:
>>
>>  site    family       person   personal_labor_income
>>  1       1            1        2000
>>  1       1            2        2300
>>  1       2            1        200
>>  1       2            3        3000
>>  2       10           4        3400
>>  2       10           5        3500
>>
>>  THE PROBLEM: I need to create a variable containing the "personal
>>  labor income" of partner.
>>
>>  Unfortunately, I've spent a day on it and haven't
>>  succeeded in solving this seemingly simple problem. I will be grateful
>>  for any hints.

```
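On the question of what could be wrong with the code: under -by:-, an explicit subscript such as personal[spouse1] refers to the observation in that position within the group, not to the observation whose person equals spouse1. The two coincide only when every household member is in the sample and sorted by person. The sketch below is one possible way around this, not a definitive solution. It assumes that person is unique within each site and family, uses made-up data, and introduces the helper variables income1 and income2 purely for illustration; censusd is left out of the by-list to match the example data in the thread.

```stata
clear
* Made-up data: in family 2, spouse 1 is not in the sample
input site family person spouse1 spouse2 personal_labor_income
1 1 1 1 3 2000
1 1 2 1 3    0
1 1 3 1 3 3000
1 2 3 1 3 1500
1 2 4 1 3    0
end

* Spread each spouse's income to every member of the family, matching on
* the value of person rather than on position within the by-group.
* max() ignores missings, so an absent spouse yields missing.
bysort site family: egen income1 = ///
    max(cond(person == spouse1, personal_labor_income, .))
bysort site family: egen income2 = ///
    max(cond(person == spouse2, personal_labor_income, .))

* Each spouse receives the other spouse's income; everyone else stays missing
gen partner_income = income2 if person == spouse1
replace partner_income = income1 if person == spouse2

list, sepby(site family)
```

With the original subscript approach, person 3 in family 2 sits in position 1 of the group, so personal[spouse1] would return that person's own income. Matching on the value of person leaves partner_income missing there instead.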