Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: matching strings on words
From: Eric Booth <[email protected]>
To: [email protected]
Subject: Re: st: matching strings on words
Date: Tue, 30 Mar 2010 14:05:23 -0500
Check out -reclink- from SSC (-ssc install reclink-), which does
probabilistic record linkage on string variables.
One example:
http://www.stata.com/statalist/archive/2009-12/msg00016.html
~ Eric
__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
[email protected]
Office: +979.845.6754
On Mar 30, 2010, at 2:00 PM, Jeph Herrin wrote:
> 
> I'm not sure what to call this - if I did, I might have
> better luck with my searches for a utility. Basically,
> I want to do something similar to the utility -nmatch-
> which matches first and last names, but I have more than
> two words per record.
> 
> The problem: I have two files with lists of hospital names.
> Hospital names tend to consist of multiple words, which get
> used to differing extents; the same hospital might be listed
> as:
> 
> st joseph's
> st joseph's memorial
> st joseph's memorial hospital
> st joseph's memorial hospital of danbury
> 
> etc. (There is also a lot of variation on e.g. "Saint" vs "St." and
> "Memorial" vs "memorial", but I have trapped most of those
> already.)
> 
> What I'd like to do is match these on "words", and generate
> a _merge variable which indicates how many words match vs
> how many words there are. Then I (or some unlucky grad student)
> can trawl through the matches and decide which ones are the
> same hospital.
> 
> I can see how to write a program to do such a thing, but am hoping
> there is already a solution out there that I overlooked?
> 
> thanks,
> Jeph
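
For reference, the word-level comparison described above can be sketched in a few lines. This is only an illustration, written in Python rather than Stata, and it is not how -reclink- or -nmatch- work internally. The function names, the min_shared cutoff, and the "mercy" hospital names are made up for the example. Each pair of names gets two counts, words in common and distinct words overall, which play the role of the "how many words match vs how many words there are" indicator.

```python
def words(name):
    """Lower-case a name and split it on whitespace into a set of words."""
    return set(name.lower().split())


def word_overlap(a, b):
    """Return (shared, total): words in common, and distinct words across both names."""
    wa, wb = words(a), words(b)
    return len(wa & wb), len(wa | wb)


def candidate_matches(master, using, min_shared=2):
    """Pair each master name with each using name that shares at least min_shared words."""
    pairs = []
    for m in master:
        for u in using:
            shared, total = word_overlap(m, u)
            if shared >= min_shared:
                pairs.append((m, u, shared, total))
    # Strongest candidates first: most shared words, then fewest distinct words.
    pairs.sort(key=lambda p: (-p[2], p[3]))
    return pairs


# Hypothetical input lists; the "st joseph's" variants come from the post above.
master = ["st joseph's memorial hospital of danbury", "mercy general hospital"]
using = ["st joseph's memorial", "st joseph's", "mercy hospital"]

for m, u, shared, total in candidate_matches(master, using):
    print(f"{shared}/{total}  {m!r} <-> {u!r}")
```

Raising min_shared trims the list that someone has to review by hand. Splitting on whitespace assumes that spelling variants such as "Saint" vs "St." have already been standardized, and comparing every pair of names may be slow for very long lists.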
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/