# Re: st: Data Processing; How to swap around observations?

From: wgould@stata.com (William Gould, Stata)
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Data Processing; How to swap around observations?
Date: Fri, 07 Nov 2003 08:28:07 -0600

```
Altay Mussurov <aam00@aber.ac.uk> wrote,

> I have the following data set [...]  on the level of schooling (s) for every
> member (memid) of the household unit [...]  (hhid).  [...] I am looking at
> married couples. There are multiple couples who reside in the same hhid. The
> variable "pairs" identifies how many couples live within the same hhid.
> [...] my "sex" and "memid" variables are not sorted and they can't be
> sorted. This is an inherent nature of the data.  Head of the household can
> have any "memid" and so can his/her spouse.  Hence, my data is organised in
> the following manner:
>
>       hhid   memid      sex    s   pairs   smale   sfemale
>          1       1   female    8       1       0         8
>          1       2     male    3       1       3         0
>          2       1     male   12       2      12         0
>          2       2   female   10       2       0        10
>          2       4   female   15       2       0        15
>          2       8     male   10       2      10         0
>
> [...]  I have to replace schooling of the male (smale) with the schooling of
> the female (sfemale) [...]
>
> This is the expected format:
>
>       hhid   memid      sex    s   pairs   smale*  sfemale
>          1       1   female    8       1       3         0
>          1       2     male    3       1       0         8
>          2       1     male   12       2      10         0
>          2       2   female   10       2       0        12
>          2       4   female   15       2      15         0
>          2       8     male   10       2       0        10

I am not certain that I yet understand the structure of these data.
Taking hhid==2,

      hhid   memid      sex    s   pairs   smale   sfemale
         2       1     male   12       2      12         0
         2       2   female   10       2       0        10

         2       4   female   15       2       0        15
         2       8     male   10       2      10         0

I am guessing the order of the dataset indicates who is married to whom.
Thus, the first pair within the hhid forms a couple, and so does the second pair.
I will proceed under that assumption.

Step 1:  Sort the dataset by hhid without disturbing the within-hhid order
--------------------------------------------------------------------------

. gen seq = _n
. sort hhid seq

Step 2:  Add a couple id variable within hhid
---------------------------------------------

. by hhid: gen couple = int( (_n-1)/2 ) + 1

Step 3:  Generate a variable that orders within couple
------------------------------------------------------

. sort hhid couple seq
. by hhid couple:  gen order = _n
. drop seq

At this point, we have the following dataset:

      hhid  couple  order   memid      sex    s   pairs   smale   sfemale
         1       1      1       1   female    8       1       0         8
         1       1      2       2     male    3       1       3         0

         2       1      1       1     male   12       2      12         0
         2       1      2       2   female   10       2       0        10
         2       2      1       4   female   15       2       0        15
         2       2      2       8     male   10       2      10         0

Now we can create the desired result:

. sort hhid couple order
. by hhid couple:  gen newsmale   = smale[2]   if _n==1
. by hhid couple:  gen newsfemale = sfemale[2] if _n==1

. by hhid couple:  replace newsmale   = smale[1]    if _n==2
. by hhid couple:  replace newsfemale = sfemale[1]  if _n==2

At this point, we should list some of the data to make sure the result is as
desired, and then

. drop smale sfemale
. rename newsmale smale
. rename newsfemale sfemale

-- Bill
wgould@stata.com
```
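
The steps in the reply can be collected into one do-file. What follows is only a sketch, not part of the original message. It rests on the same guess the reply makes, that consecutive rows within an hhid form a couple, and adds two `assert` checks that stop the run if that guess fails. The `input` block merely retypes the six example rows from the question so the sketch runs on its own. The `3 - _n` subscript is a swapped-in shortcut for the four `gen`/`replace` lines: in a group of exactly two observations it always points at the other member.

```stata
* Sketch only: assumes consecutive rows within hhid are spouses.
clear
input hhid memid str6 sex s pairs smale sfemale
1 1 "female"  8 1  0  8
1 2 "male"    3 1  3  0
2 1 "male"   12 2 12  0
2 2 "female" 10 2  0 10
2 4 "female" 15 2  0 15
2 8 "male"   10 2 10  0
end

* Steps 1-3 of the reply: preserve the original order, then number
* the couples within hhid and the members within each couple.
gen seq = _n
sort hhid seq
by hhid: gen couple = int((_n-1)/2) + 1
sort hhid couple seq
by hhid couple: gen order = _n
drop seq

* Guard the pairing assumption: two members per couple, one of each sex.
sort hhid couple order
by hhid couple: assert _N == 2
by hhid couple: assert sex[1] != sex[2]

* In a group of two, observation 3 - _n is the partner
* (row 1 reads row 2, and row 2 reads row 1).
by hhid couple: gen newsmale   = smale[3 - _n]
by hhid couple: gen newsfemale = sfemale[3 - _n]

* Inspect the result before dropping and renaming anything.
list hhid couple order sex smale sfemale newsmale newsfemale, sepby(hhid couple)
```

If either `assert` fails, the rows do not pair up as assumed, and the couple identifier needs to come from some other source before any values are swapped.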