Statalist



st: RE: Matching Names


From   "Kieran McCaul" <kamccaul@meddent.uwa.edu.au>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Matching Names
Date   Fri, 8 Aug 2008 06:10:15 +0800

This is a big problem.

You might want to investigate using soundex to help with matching the misspelt names but, depending on the version of soundex that you use, it may not be particularly useful.

Michael Blasnik wrote an egen function to implement a soundex algorithm a while ago for Stata 7.
http://ideas.repec.org/c/boc/bocode/s420901.html

You could try that.
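To give a feel for what soundex does and does not catch, here is a minimal sketch in Python rather than Stata. It assumes the common American Soundex rules (H and W do not separate duplicate codes) and is not Blasnik's egen function; the helper names strip_accents, soundex and name_key are made up for this illustration.

```python
import unicodedata

# Letter-to-digit table for American Soundex (assumed variant).
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def strip_accents(text):
    """Drop combining marks so that 'É' becomes 'E'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def soundex(word):
    """Return a 4-character Soundex code, or '' if the word has no letters."""
    letters = [ch for ch in strip_accents(word).upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit:
            if digit != prev:      # collapse runs of the same code
                code += digit
            prev = digit
        elif ch not in "HW":       # vowels (and Y) break a run; H and W do not
            prev = ""
    return (code + "000")[:4]


def name_key(name):
    """Code each whitespace-separated token of a full name separately."""
    codes = (soundex(token) for token in name.split())
    return tuple(code for code in codes if code)


if __name__ == "__main__":
    for a, b in [("PÉREZ", "P´REZ"), ("SMITH", "SMITHSS"), ("CHOCAN", "CHOCANOS")]:
        print(a, soundex(a), b, soundex(b))
```

With these rules PÉREZ and P´REZ get the same code, but SMITH and SMITHSS do not, nor do CHOCAN and CHOCANOS, because the extra trailing consonant adds a digit. So a soundex key will catch some of the typos in the examples, not all of them.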




______________________________________________
Kieran McCaul MPH PhD
WA Centre for Health & Ageing (M573)
University of Western Australia
Level 6, Ainslie House
48 Murray St
Perth 6000
Phone: (08) 9224-2140
Phone: +61-8-9224-2140
email: kamccaul@meddent.uwa.edu.au
http://myprofile.cos.com/mccaul 
_______________________________________________


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Max Perez Leon
Sent: Friday, 8 August 2008 5:03 AM
To: statalist@hsphsun2.harvard.edu
Subject: st: Matching Names


Hello statalist users,

I am having a big problem trying to merge two datasets with names. The problem is
that there are tons of typos in both datasets. Examples below:

DATASET 1: --------------------- DATASET 2:

NAMES--------------------------- NAMES

LUIS PÉREZ --------------------- LUIS P´REZ
WILLIAM SMITH ------------------ WILLIAM SMITHSS
JORGE F. CHOCAN ---------------- JORGE F CHOCANOS
P. BROWN ----------------------- PAUL BROWN
ENRIQUETA GAUDENCIA------------- ENRIQUETA G

I could do it by hand, but I have 52568 obs and more to come. I am trying to
establish a method using regular expressions so that I can merge the datasets
correctly.
Any help will be very much appreciated, 

Thanks for your time,
Max Perez Leon
PUCP-IEP


         

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




