The Stata listserver

RE: st: Re: Finding "near"-matches


From   "Allan Reese (Cefas)" <r.a.reese@cefas.co.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Re: Finding "near"-matches
Date   Mon, 31 Oct 2005 10:33:30 -0000

Earlier versions of dBase had a fuzzy string match function, which was very useful.  Our university email system in the 1980s also used fuzzy matching, so if you emailed "Alan Reese" you would get a reply along the lines of "Name not recognised, nearest matches are Allan Reese, Alan Rees".  I assume this feature was killed because it was an open door for spammers.  Even so, a fuzzy string match function would be a useful addition to Stata.
Allan 
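
[Editorial illustration, not part of the original message.] The "nearest matches" behaviour described above can be sketched in Python with the standard library's difflib module. This is only a minimal sketch: the function name nearest_names, the directory list and the 0.6 cutoff are assumptions made for illustration, and nothing here reflects how dBase or the 1980s email system actually worked.

```python
from difflib import get_close_matches

def nearest_names(query, directory, limit=2, cutoff=0.6):
    """Return up to `limit` directory entries that most resemble `query`.

    Similarity is difflib's ratio (0.0 to 1.0); entries scoring below
    `cutoff` are dropped. Results are ordered best match first.
    """
    return get_close_matches(query, directory, n=limit, cutoff=cutoff)

# Hypothetical directory, using the names from the message plus one decoy.
directory = ["Allan Reese", "Alan Rees", "John Smith"]

if __name__ == "__main__":
    matches = nearest_names("Alan Reese", directory)
    print("Name not recognised, nearest matches are " + ", ".join(matches))
```

The cutoff is the main design choice: a lower value returns more candidates for hand-checking, a higher one returns fewer but closer names.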

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu]On Behalf Of Aaron
Sent: 28 October 2005 16:49
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Re: Finding "near"-matches


The topic gets more and more interesting. I often need to match
'fuzzily' the names from two databases that have very minor
differences. Here are some examples:

Ford Co.
Ford Corporation
Ford Inc. (just an example)

or

XYZ Tech
XYZ Technology Inc.

Can you recommend some programs to generate a list of 'fuzzy' or
'near' matches for a name (one or more alphanumeric characters)?
Even if a program provides the three possible matches for
the name 'Ford', that's still better than hand-checking.

Aaron
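
[Editorial illustration, not part of the original message.] One possible way to produce such a candidate list for the company names above is to normalise each name first (lower-case it, strip punctuation, drop common legal suffixes) and then compare the normalised strings. The sketch below uses Python's standard difflib; the suffix list, the 0.7 cutoff and the function names normalize and near_matches are all assumptions chosen for illustration, not an established method or a Stata feature.

```python
import re
from difflib import SequenceMatcher

# Assumed list of legal suffixes to ignore; extend as the data requires.
SUFFIXES = {"co", "corp", "corporation", "inc", "ltd"}

def normalize(name):
    """Lower-case, replace punctuation with spaces, drop legal suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(w for w in words if w not in SUFFIXES)

def near_matches(name, candidates, cutoff=0.7):
    """Return candidates whose normalised form resembles that of `name`.

    Similarity is difflib's ratio (0.0 to 1.0); `cutoff` is an assumed
    threshold that should be tuned against real data.
    """
    target = normalize(name)
    return [
        c for c in candidates
        if SequenceMatcher(None, target, normalize(c)).ratio() >= cutoff
    ]

if __name__ == "__main__":
    names = ["Ford Co.", "Ford Corporation", "Ford Inc.", "XYZ Technology Inc."]
    print(near_matches("Ford", names))
    print(near_matches("XYZ Tech", names))
```

Normalising before comparing means "Ford Co." and "Ford Corporation" reduce to the same string, so the similarity score only has to absorb genuine differences such as "Tech" versus "Technology". The output is still a candidate list to be checked by hand, not a final match.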






*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


