The Stata listserver

Re: st: identifying last name/first name inversions in administrative datasets


From   Rafal Raciborski <[email protected]>
To   [email protected]
Subject   Re: st: identifying last name/first name inversions in administrative datasets
Date   Wed, 12 Oct 2005 09:36:04 -0400

Pierre,

Why not something like:


. list

    +----------------------------+
    | id        var1        var2 |
    |----------------------------|
 1. |  1   Alexander       Knuth |
 2. |  2       Knuth   Alexander |
 3. |  3   Alexander       Brown |
 4. |  4        John       Brown |
 5. |  5       Brown        John |
    |----------------------------|
 6. |  6   Alexander       Knuth |
    +----------------------------+

. gen fullname = var1 + " " + var2

. list

    +----------------------------------------------+
    | id        var1        var2          fullname |
    |----------------------------------------------|
 1. |  1   Alexander       Knuth   Alexander Knuth |
 2. |  2       Knuth   Alexander   Knuth Alexander |
 3. |  3   Alexander       Brown   Alexander Brown |
 4. |  4        John       Brown        John Brown |
 5. |  5       Brown        John        Brown John |
    |----------------------------------------------|
 6. |  6   Alexander       Knuth   Alexander Knuth |
    +----------------------------------------------+

. gen match = 0

. forvalues i = 1/6 {
 2.         replace match = `i' if regexm(fullname[`i'], var1) & regexm(fullname[`i'], var2)
 3. }

<snip>

. sort match

. list

    +------------------------------------------------------+
    | id        var1        var2          fullname   match |
    |------------------------------------------------------|
 1. |  3   Alexander       Brown   Alexander Brown       3 |
 2. |  4        John       Brown        John Brown       5 |
 3. |  5       Brown        John        Brown John       5 |
 4. |  2       Knuth   Alexander   Knuth Alexander       6 |
 5. |  1   Alexander       Knuth   Alexander Knuth       6 |
    |------------------------------------------------------|
 6. |  6   Alexander       Knuth   Alexander Knuth       6 |
    +------------------------------------------------------+


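Now records for the same person share the same 'match' number, whichever
order the names were entered in. (The number is simply the last observation
whose fullname contains both var1 and var2.)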
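
The loop above compares every record against every other and treats var1 and var2 as regular expressions, so it could get slow on a large file and may misfire on names that contain regex metacharacters or that are substrings of other names (John vs. Johnson). A possible alternative, offered only as a sketch: build a key that does not depend on the order of the two names and let egen number the groups. The variable names key and match2 are made up for this illustration, and the comparison is exact and case-sensitive, so real data would probably need trimming and case-folding first.

```stata
* Sketch only: key and match2 are hypothetical names, not from the original post.
clear
input id str20 var1 str20 var2
1 "Alexander" "Knuth"
2 "Knuth"     "Alexander"
3 "Alexander" "Brown"
4 "John"      "Brown"
5 "Brown"     "John"
6 "Alexander" "Knuth"
end

* Put the alphabetically smaller name first, so that "Knuth Alexander"
* and "Alexander Knuth" end up with the same key.
gen key = cond(var1 <= var2, var1 + " " + var2, var2 + " " + var1)

* group() assigns one integer to each distinct value of key.
egen match2 = group(key)

sort match2 id
list id var1 var2 match2, sepby(match2)
```

As in the loop version, records that differ only in name order get the same number, but here the dataset is processed in one pass rather than once per observation.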

rafal


================
Rafal Raciborski
Graduate student
Department of Political Science
Emory University
301 Tarbutton Hall
1555 Dickey Drive
Atlanta, GA 30322
404-378-9826 (home)
[email protected]
http://www.roofoos.net/

