
st: AW: Methods for linking inter-generational observations

From   "Martin Weiss" <[email protected]>
To   <[email protected]>
Subject   st: AW: Methods for linking inter-generational observations
Date   Mon, 25 Jan 2010 09:25:46 +0100


Generally speaking, you may want to browse the FAQ. Most of the time, you will
end up using -bysort- rather than -duplicates- to conduct such operations.

If you provide an excerpt of your data, further help may be forthcoming.
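
As a rough sketch of the -bysort- route, and not a tested solution for your data: -duplicates tag- stores the number of other observations sharing the key, which is the same value for every sibling, so it cannot serve as a birth order. Numbering observations within each mother does give one. The variable names maternal_id, dateofbirth, maidenname and firstname are taken from the post quoted below; child_dob (the child's own date of birth) and mother_key are hypothetical names introduced here, and treating the three name/date fields as identifying the mother is an assumption.

```stata
* Sketch only. child_dob and mother_key are hypothetical names;
* the remaining variables come from the quoted post.

* Records with a maternal ID: number each mother's births in date order
bysort maternal_id (child_dob): gen parity = _n if !missing(maternal_id)

* Number of records found for each mother with an ID
bysort maternal_id: gen n_births = _N if !missing(maternal_id)

* Records without a maternal ID: fall back on a composite key
* (group() returns missing if any component variable is missing)
egen mother_key = group(dateofbirth maidenname firstname)
bysort mother_key (child_dob): replace parity = _n ///
    if missing(parity) & !missing(mother_key)
```

Note that the fallback step numbers every record within mother_key, including any that already carry a maternal ID, so the two numbering schemes should be checked against each other before relying on parity.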


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nathan Hutto
Sent: Monday, 25 January 2010 04:47
To: [email protected]
Subject: st: Methods for linking inter-generational observations

Hi all,

I have a dataset containing birth records in a particular region over
10 years. Many of the records are siblings or half-siblings. I need to
match siblings and half-siblings using a number of identifiers. I have
a unique maternal ID number for some of the sample; for the remainder
of the sample I'll have to match on name, geocode, blood type, etc.
Having never done this before, I'm not sure if I'm on the right track
or using the most accurate/efficient approach.

Here is the command I'm using to match on the maternal id:

duplicates tag maternal_id, gen(id_match)
gen parity = id_match if id_match<=10

For other match variables, these are the commands I'm using:

egen dobmaidfirst = group( dateofbirth maidenname firstname)
duplicates tag dobmaidfirst, gen(dobmaidfirstmatch)
replace parity = dobmaidfirstmatch if dobmaidfirstmatch<=4 &

Does anyone know if this is a common approach to matching individuals
existing within the same dataset? I know there are also probabilistic
methods, which I may try later. But first I need to use this approach.

Thank you,

Doctoral Student, Columbia University School of Social Work
Fellow, Columbia Population Research Center
