You may want to try reclink, which I wrote and which is available
from SSC. It uses a bigram string comparator to rank agreement
between strings. reclink would be especially helpful if you have
other variables that may be useful for the match -- like gender, age
or location. Even without such variables, you may benefit from
creating derived variables that can be added to the reclink matching
process -- including a soundex of each name (first and last) and
initials that could be used for blocking in an initial run of reclink
to identify the better matches more quickly.
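
To make the idea concrete, here is a rough sketch in Python of bigram scoring combined with blocking. It is not reclink's code or its exact comparator: it assumes a plain Dice coefficient on character bigrams, blocks on the first letter of the last name only (no soundex), and the names bigram_score, best_matches and min_score are made up for the illustration.

```python
from collections import Counter, defaultdict


def bigrams(s):
    """Multiset of adjacent character pairs in a lowercased, trimmed string."""
    s = s.lower().strip()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def bigram_score(a, b):
    """Dice coefficient on bigram multisets (assumed comparator, not reclink's)."""
    ba, bb = bigrams(a), bigrams(b)
    total = sum(ba.values()) + sum(bb.values())
    if total == 0:
        return 0.0
    shared = sum((ba & bb).values())  # multiset intersection
    return 2.0 * shared / total


def best_matches(master, using, min_score=0.6):
    """Block on the last-name initial, then keep the top-scoring candidate.

    master and using are lists of (first, last) tuples; min_score is a
    hypothetical cutoff below which candidates are discarded.
    """
    blocks = defaultdict(list)
    for first, last in using:
        blocks[last[:1].lower()].append((first, last))

    results = {}
    for first, last in master:
        # Only compare against records sharing the blocking key.
        candidates = blocks.get(last[:1].lower(), [])
        scored = [
            (bigram_score(f"{first} {last}", f"{f} {l}"), (f, l))
            for f, l in candidates
        ]
        scored = [pair for pair in scored if pair[0] >= min_score]
        if scored:
            results[(first, last)] = max(scored)
    return results
```

Blocking keeps the number of comparisons down because each master record is scored only against records in its own block; the price is that a name with a wrong first letter will never be matched, which is why blocking suits an initial pass rather than the final one.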
Michael Blasnik