Re: st: Data Processing; How to swap around observations?

From	[email protected] (William Gould, Stata)
To	[email protected]
Subject	Re: st: Data Processing; How to swap around observations?
Date	Fri, 07 Nov 2003 08:28:07 -0600

Altay Mussurov <[email protected]> wrote, 

> I have the following data set [...]  on the level of schooling (s) for every
> member (memid) of the household unit [...]  (hhid).  [...] I am looking at
> married couples. There are multiple couples who reside in the same hhid. The
> variable "pairs" identifies how many couples live within the same hhid.
> [...] my "sex" and "memid" variables are not sorted and they can't be
> sorted. This is an inherent nature of the data.  Head of the household can
> have any "memid" and so can his/her spouse.  Hence, my data is organised in
> the following manner:
>
>       hhid   memid      sex    s   pairs   smale   sfemale
>          1       1   female    8       1       0         8
>          1       2     male    3       1       3         0
>          2       1     male   12       2      12         0
>          2       2   female   10       2       0        10
>          2       4   female   15       2       0        15
>          2       8     male   10       2      10         0
>
> [...]  I have to replace schooling of the male (smale) with the schooling of
> the female (sfemale) [...]
>
> This is the expected format:
> 
>       hhid   memid      sex    s   pairs   smale*  sfemale
>          1       1   female    8       1       3         0
>          1       2     male    3       1       0         8
>          2       1     male   12       2      10         0
>          2       2   female   10       2       0        12
>          2       4   female   15       2      15         0
>          2       8     male   10       2       0        10

I am not certain that I yet understand the structure of these data.
Taking hhid==2, 

        hhid   memid      sex    s   pairs   smale   sfemale

           2       1     male   12       2      12         0
           2       2   female   10       2       0        10

           2       4   female   15       2       0        15
           2       8     male   10       2      10         0

I am guessing that the order of the dataset indicates who is married to whom.
Thus, the first two observations within the hhid form one couple, as do the
second two.  I will proceed under that assumption.


Step 1:  Sort the dataset by hhid without disturbing the within-hhid order
--------------------------------------------------------------------------

        . gen long seq = _n
        . sort hhid seq
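
An optional check, not part of the steps proper:  the couple numbering in
Step 2 pairs up consecutive observations, so it only makes sense if every
household holds exactly two observations per couple.  Assuming "pairs" counts
the couples in the household, as the original message describes, something
like the following would stop with an error if any household breaks that
pattern:

        . * assumption: pairs = number of couples in the hhid, two obs. each
        . by hhid: assert _N == 2*pairs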


Step 2:  Add a couple id variable within hhid
---------------------------------------------

        . by hhid: gen couple = int( (_n-1)/2 ) + 1


Step 3:  Generate a variable that orders within couple
------------------------------------------------------

        . sort hhid couple seq
        . by hhid couple:  gen order = _n
        . drop seq


At this point, we have the following dataset:

        hhid  couple  order   memid      sex    s   pairs   smale   sfemale

           1       1      1       1   female    8       1       0         8
           1       1      2       2     male    3       1       3         0
 
           2       1      1       1     male   12       2      12         0
           2       1      2       2   female   10       2       0        10
           2       2      1       4   female   15       2       0        15
           2       2      2       8     male   10       2      10         0
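
Everything below relies on each hhid-couple group holding exactly two
observations, one of each sex.  A quick check along these lines might be
worthwhile before going further.  It assumes only that "sex" takes different
values for males and females, whether it is stored as a string or as a
labeled numeric:

        . sort hhid couple order
        . * each couple should have exactly two members
        . by hhid couple:  assert _N == 2
        . * and the two members should not be of the same sex
        . by hhid couple:  assert sex[1] != sex[2]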

Now we can create the desired result:

        . sort hhid couple order 
        . by hhid couple:  gen newsmale   = smale[2]   if _n==1
        . by hhid couple:  gen newsfemale = sfemale[2] if _n==1

        . by hhid couple:  replace newsmale   = smale[1]    if _n==2
        . by hhid couple:  replace newsfemale = sfemale[1]  if _n==2
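
As an aside, and only if every couple has exactly two observations, the four
commands above could be collapsed into two.  Under -by-, explicit subscripts
count within the by-group, so [3 - _n] refers to observation 2 when _n is 1
and to observation 1 when _n is 2.  This is a sketch of an alternative, to be
run instead of the four commands above rather than after them:

        . * alternative: swap within couple in a single pass per variable
        . by hhid couple:  gen newsmale   = smale[3 - _n]
        . by hhid couple:  gen newsfemale = sfemale[3 - _n]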

At this point, we should list some of the data to make sure the result is as
desired, and then

        . drop smale sfemale
        . rename newsmale smale
        . rename newsfemale sfemale

-- Bill
[email protected]