Statalist



Re: st: String search


From   Simon Moore <simoncmoore@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: String search
Date   Wed, 13 Aug 2008 17:41:50 +0100

Dear Scott and Rafal,

Very many thanks for both suggestions - I'll play around and see what happens.

Regards
Simon



Scott Merryman wrote:

On Wed, Aug 13, 2008 at 8:17 AM, Simon Moore <simoncmoore@gmail.com> wrote:
Dear Statalist,

I have a string variable that contains values something like this:-

"outside the red lion pub"
"red lion"
"in the red lyon"

and so on.

I need to search this variable for names (e.g. "red lion") and would like to
do so in a way that tolerates the inevitable typos (e.g. "red lyon").

How about using Michael Blasnik's implementation of the Soundex
algorithm (as described by Donald Knuth):

clear
input str25 var1
"outside the red lion pub"
"red lion"
"red   lion"
"redlion"
"in the red lyon"
"blue lion"
"red troll"
end

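* Soundex code for each string; length(12) asks for a 12-character code
egen foo = soundex(var1), length(12)
* Flag codes with a 3 (d or t) followed later by a 5 (m or n), as in "red lion"
gen tag = regexm(foo, "3.*5")
list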
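
To see why the pattern "3.*5" picks out the "red lion" variants, here is a rough sketch of a standard Soundex encoder in Python. It is an illustration only, written under some assumptions: non-letters such as spaces are dropped before encoding, and the result is padded with zeros to the requested length. The egen function used above may differ in those details, so the codes it produces need not match these exactly. The names CODES, soundex, samples and tags are made up for this example.

```python
import re

# Standard Soundex letter groups. Vowels, h, w and y carry no digit.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(text, length=4):
    """Return a Soundex-style code of the given length (assumption: non-letters are ignored)."""
    letters = [c for c in text.lower() if c.isalpha()]
    if not letters:
        return ""
    out = letters[0].upper()           # the first letter is kept as a letter
    prev = CODES.get(letters[0], "")   # its digit still suppresses an immediate repeat
    for c in letters[1:]:
        if c in "hw":
            continue                   # h and w do not separate equal codes
        code = CODES.get(c, "")
        if code and code != prev:
            out += code
        prev = code                    # a vowel resets prev, so a repeat after it is coded again
    return (out + "0" * length)[:length]


samples = [
    "outside the red lion pub",
    "red lion",
    "red   lion",
    "redlion",
    "in the red lyon",
    "blue lion",
    "red troll",
]

# Same test as the regexm() call: a 3 (d or t) followed somewhere later by a 5 (m or n).
tags = [bool(re.search("3.*5", soundex(s, 12))) for s in samples]

for s, t in zip(samples, tags):
    print(soundex(s, 12), t, s)
```

In this sketch the first five strings are flagged and "blue lion" and "red troll" are not. The pattern is loose, though: any string with a d or t followed later by an m or n is flagged too, so "old town" (O435 at length 4) would also match. The tagged rows are better treated as candidates to check by eye.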

Scott
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
*


