Statalist



Re: st: String search


From   "Scott Merryman" <scott.merryman@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: String search
Date   Wed, 13 Aug 2008 10:12:24 -0500

On Wed, Aug 13, 2008 at 8:17 AM, Simon Moore <simoncmoore@gmail.com> wrote:
> Dear Statalist,
>
> I have a string variable that contains values such as:
>
> "outside the red lion pub"
> "red lion"
> "in the red lyon"
>
> and so on.
>
> I need to search this variable for names (e.g. "red lion") and would like to
> do so in a way that tolerates the inevitable typos (e.g. "red lyon").
>

How about using Michael Blasnik's implementation of the SOUNDEX
algorithm, as described by Donald Knuth:

clear
input str25 var1
"outside the red lion pub"
"red lion"
"red   lion"
"redlion"
"in the red lyon"
"blue lion"
"red troll"
end

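* Encode each value phonetically; length(12) requests a 12-character code
egen foo = soundex(var1), length(12)
* Tag codes containing a 3 (d or t) followed later by a 5 (m or n)
gen tag = regexm(foo, "3.*5")
list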
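
In Soundex, d and t are coded 3, l is coded 4, and m and n are coded 5, so "3.*5" tags any value with a d/t sound followed somewhere later by an m/n sound. That is deliberately loose and may also tag values such as "the blue lion". Below is a minimal sketch of how one might check and tighten the match. The names tag2 and literal are hypothetical, and the tighter pattern assumes the egen function drops vowels and spaces so that d, l, n in "red lion" encode as adjacent digits; verify that against the listed codes before relying on it.

```stata
* Inspect the codes and the rows tagged by the loose pattern
list var1 foo tag

* Tighter pattern (assumption: d-l-n appear as adjacent digits 3, 4, 5 in foo)
gen byte tag2 = regexm(foo, "345")

* Literal search for comparison; it misses typos such as "red lyon"
gen byte literal = strpos(lower(var1), "red lion") > 0

list var1 tag tag2 literal
```

Comparing the three indicators side by side shows which rows only the phonetic match catches and which rows it tags by accident.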

Scott