Statalist



st: Methods for linking inter-generational observations


From   Nathan Hutto
Subject   st: Methods for linking inter-generational observations
Date   Sun, 24 Jan 2010 22:47:22 -0500

Hi all,

I have a dataset containing birth records in a particular region over
10 years. Many of the records are siblings or half-siblings. I need to
match siblings and half-siblings using a number of identifiers. I have
a unique maternal ID number for some of the sample; for the remainder
of the sample I'll have to match on name, geocode, blood type, etc.
Having never done this before, I'm not sure if I'm on the right track
or using the most accurate/efficient approach.

Here are the commands I'm using to match on the maternal ID:

duplicates tag maternal_id, gen(id_match)
gen parity = id_match if id_match<=10
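
One point worth checking in this step: `duplicates tag` treats missing values as equal to one another, so records with no maternal ID would be tagged as duplicates of each other. The following is only a minimal sketch on made-up data, assuming `maternal_id` is numeric and missing where no ID is available; it restricts the tag to records that have an ID.

```stata
clear
* Toy data (hypothetical): two records share ID 101, one has ID 102,
* and two have no maternal ID at all
input maternal_id
101
101
102
.
.
end

* Tag only records with a non-missing ID, so the two records without
* an ID are not counted as duplicates of each other
duplicates tag maternal_id if !missing(maternal_id), gen(id_match)

list maternal_id id_match
```

With this restriction, records lacking an ID are left with a missing `id_match`, so they stay available for the later name-based matching step.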

For other match variables, these are the commands I'm using:

egen dobmaidfirst = group( dateofbirth maidenname firstname)
duplicates tag dobmaidfirst, gen(dobmaidfirstmatch)
replace parity = dobmaidfirstmatch if dobmaidfirstmatch<=4 & (parity==0 | parity==.)
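
The composite-key step can also be sketched without `duplicates tag`, using `bysort` on the group variable. This is an illustration on made-up data, not a recommended procedure: the variable names `mother_dob` and `child_dob` are assumptions (the post does not say whose date of birth `dateofbirth` is), and `n_sibs` and `birth_order` are hypothetical names. It relies on the default behaviour of `egen ... group()`, which leaves the key missing when any component is missing.

```stata
clear
* Toy data (hypothetical): two births to the same mother, one to a
* second mother, and one record with the mother's date of birth missing
input str10 maidenname str10 firstname mother_dob child_dob
"Smith" "Anna" 100 5000
"Smith" "Anna" 100 5800
"Jones" "Beth" 250 5200
"Lee"   "Cara" .   5300
end

* Build the composite key; it is missing if any component is missing
egen mother_key = group(mother_dob maidenname firstname)

* Number of other records sharing the key, skipping missing keys
bysort mother_key: gen n_sibs = _N - 1 if !missing(mother_key)

* Birth order within each matched mother, by the child's date of birth
bysort mother_key (child_dob): gen birth_order = _n if !missing(mother_key)

list
```

Keeping the sibling count and the birth order in separate variables avoids using one variable for both meanings.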

Does anyone know whether this is a common approach to matching
individuals within the same dataset? I know there are also
probabilistic methods, which I may try later, but first I need to use
this approach.

Thank you,
Nathan

--
Doctoral  Student • Columbia University School of Social Work
Fellow • Columbia Population Research Center



