
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: How to identify siblings

From	Maarten Buis <[email protected]>
To	[email protected]
Subject	Re: st: How to identify siblings
Date	Wed, 16 Jun 2010 02:58:02 -0700 (PDT)

--- On Wed, 16/6/10, speciale speciale wrote:
> I have a dataset containing information on a population
> born in 1980-1990, and some of the persons in the dataset are
> siblings. I am interested in doing some descriptive analyses, where I:
> 
> - Compare full brothers with an age difference of maximum five years
> 
> - Compare full sisters with an age difference…
> 
> - Compare half brothers with an age difference…
> 
> - Compare half sisters with an age difference…
> 
> I only wish to compare siblings who are discordant on
> marital status in order to find out if this has any
> impact on later health outcomes.
>
> The dataset includes the following information:
> 
> Person ID
> 
> Gender
> 
> Mother ID
> 
> Father ID
> 
> Age/day of birth of the person
> 
> Marital status
> 
> Different health indicators
> 
> Etc..
> 
> 
> So my questions are:
> 
> - How to identify who in the dataset are half siblings /
> full siblings?
>
> - How is it possible to handle the fact that one person might have
> both several half and full siblings, and make pairwise comparisons
> of each of these pairs (but only if they are of the same gender
> and have an age difference of at most 5 years)?

I would stick with families consisting of only 2 children.
This makes things easier for you (and as the "official" reason
you can say that this way you control for the possible
confounding effect of family size).

You can select full siblings by making sure they have the
same father and mother. The line -egen famid = group(fid mid)-
creates a unique identifier for people who share the same
parents.
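
Half siblings share exactly one parent, so one way to flag them
is to count how many persons share a mother (or a father) and
subtract those who share both parents. The lines below are only
a sketch of that idea: the variable names nfull, nmat, npat,
halfmat and halfpat are made up for illustration, the data are
toy values, and it assumes that fid and mid are never missing.

```stata
*----------------- begin example -------------------
* sketch only: toy data, hypothetical variable names
drop _all
input id fid mid
1 1 1
2 2 1
3 1 1
4 3 2
5 3 2
6 4 3
end

egen famid = group(fid mid)

* number of persons (oneself included) sharing both parents,
* the mother, and the father
bys famid : gen nfull = _N
bys mid   : gen nmat  = _N
bys fid   : gen npat  = _N

* maternal and paternal half sibs: same mother (father),
* but not the same pair of parents
gen halfmat = nmat - nfull
gen halfpat = npat - nfull

sort id
list id fid mid nfull halfmat halfpat
*--------------------- end example ------------------
```

Here person 2 shares a mother, but not a father, with persons
1 and 3, so halfmat is 2 for person 2 and 1 for persons 1 and 3.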

If you want to compare sibs it is often convenient to 
-reshape- your data so that you have two variables: in
this case one for the first born and one for the second
born. In your case you probably want to have marital
status as your -j()-.

*----------------- begin example -------------------
drop _all
input id byr fid mid x
1  1977 1 1 1 
2  1979 2 1 5 
3  1982 1 1 4
4  1966 3 2 6 
5  1971 3 2 4 
6  1973 4 3 2
7  1965 5 4 3
8  1967 5 4 4
9  1969 5 4 7
end

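* one identifier per combination of father and mother
egen famid = group(fid mid)
* rank within family by year of birth, and count the children
bys famid (byr) : gen birthorder = _n
bys famid (byr) : gen size = _N

* keep only the two-child families
keep if size == 2

* one row per family: x1 id1 ... first born, x2 id2 ... second born
reshape wide x id fid mid byr, i(famid) j(birthorder)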
*--------------------- end example ------------------
(For more on examples I sent to the Statalist see: 
http://www.maartenbuis.nl/example_faq )
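
The example above uses birth order as the -j()-. A variant with
marital status as the -j()-, as suggested earlier, might look
like the sketch below. The variables married (0/1) and health,
the helper variable discordant, and all values are invented for
illustration. It only works if each remaining family has exactly
one married and one unmarried sib, which is why the concordant
pairs are dropped first.

```stata
*----------------- begin example -------------------
* sketch only: toy data, hypothetical variables married and health
drop _all
input id byr fid mid married health
1  1981 1 1 0 3
2  1984 1 1 1 5
3  1982 2 2 1 4
4  1985 2 2 1 6
5  1983 3 3 1 2
6  1986 3 3 0 7
7  1988 3 4 0 1
end

egen famid = group(fid mid)
bys famid (married) : gen size = _N

* sorted on married within family, so the first and last
* values differ only when the sibs are discordant
bys famid (married) : gen discordant = married[1] != married[_N]

keep if size == 2 & discordant

* one row per family: suffix 0 is the unmarried sib,
* suffix 1 the married sib
reshape wide id byr health, i(famid) j(married)

gen diff = health1 - health0
list famid id0 id1 health0 health1 diff
*--------------------- end example ------------------
```

After the -reshape- each row is one discordant pair, so the
comparison of the married with the unmarried sib is a simple
difference between two variables.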


--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------



      
