Re: st: "sounds like" function in Stata

From	Thomas Speidel <[email protected]>
To	[email protected]
Subject	Re: st: "sounds like" function in Stata
Date	Tue, 18 Aug 2009 13:07:32 -0600

I believe Stata 11 now has a built-in function for this:
http://www.stata.com/help.cgi?soundex
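
A minimal sketch of how the built-in function might be applied to the company names from the original question, assuming Stata 11 or later. The variable names code and similarity are made up for illustration, and the data are typed in with -input- only so that the example is self-contained:

```stata
* Sketch only: assumes Stata 11 or later, where soundex() is built in
clear
input str20 name
"AOL"
"AOL Time Warner"
"Time Warner Inc"
"Microsoft"
"Microsoft Inc"
"Microsft"
end

* soundex() returns a letter followed by three digits
generate str4 code = soundex(name)

* Names that share a soundex code receive the same group number
egen similarity = group(code)

sort similarity name
list name code similarity, sepby(similarity)
```

Soundex keeps the first letter of the string, so this should catch misspellings such as "Microsft" but cannot by itself tie "AOL" to "Time Warner Inc". Names like those would still need -split- or a manual check.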

Quoting Tirthankar Chakravarty <[email protected]> Tue 18 Aug 13:01:47 2009:

ssc install _gsoundex

due to Michael Blasnik. Or, in Stata 11:
http://www.stata.com/help.cgi?string+functions

T

On Tue, Aug 18, 2009 at 7:55 PM, Dalhia <[email protected]> wrote:

Hi, Is there some kind of "sounds like" function in Stata? I have a list of companies but the names are sometimes a little different. For example, AOL Time Warner also appears as AOL, Time Warner, and Time Warner Inc. I need a method to figure out that all these are the same entity, and none of the string functions in Stata seem to do what I want. Do any of you have any suggestions? Here is what the data look like:


Name

AOL
AOL Time Warner
Time Warner Inc
Microsoft
Microsoft Inc
Microsft

Ideally, what I would like is some way to indicate which names are similar. For example:


Name, Similarity

AOL, 1
AOL Time Warner, 1
Time Warner Inc, 1
Microsoft, 2
Microsoft Inc, 2
Microsft, 2

Any help will be much appreciated.
Thanks
Dalhia

--- On Fri, 6/5/09, Nick Cox <[email protected]> wrote:

From: Nick Cox <[email protected]>
Subject: st: RE: appling string functions across observations
To: [email protected]
Date: Friday, June 5, 2009, 3:00 PM

Check out -fndmtch2- or -fndmtch- from SSC. At first sight they don't address this problem, but there are at least two ways forward:

It sounds as if you have surnames and full names (or the equivalent in your area). -split- the full names and work with the separate variables.

Clone one of the programs above but modify the code to look for string inclusion rather than strict equality.

Nick
[email protected]

Dalhia

I have a list of two variables: name1 and name2. I need to check if name2 occurs in any of the name1s. The regexm() function in Stata is perfect for what I want to do, but it checks only one string at a time, and I need it to somehow rotate over a whole list of names.

Here is what I have:

name1
ram solanki
goel mehta
ashish gupta

name2
solanki
mehta

I need to be able to figure out that "solanki" and "mehta" in name2 occur in observations 1 and 2 of name1.
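
One way the "string inclusion rather than strict equality" idea might look, as a rough sketch and not a clone of -fndmtch-: loop over the observations of name2 and flag each name1 that contains it, using strpos(). The variable matched is hypothetical, and the example data are the ones listed above. The /// line continuations assume the code is run from a do-file:

```stata
* Sketch only: example data typed in so the block is self-contained
clear
input str20 name1 str20 name2
"ram solanki"  "solanki"
"goel mehta"   "mehta"
"ashish gupta" ""
end

* matched will hold the first name2 value found inside each name1
generate str20 matched = ""

* Loop over every observation of name2; name2[`j'] is a single value
forvalues j = 1/`=_N' {
    replace matched = name2[`j']                 ///
        if matched == "" & name2[`j'] != ""      ///
        & strpos(name1, name2[`j']) > 0
}

list name1 matched
```

strpos() is a plain substring test, so a short name2 could also match inside a longer word. Padding both strings with spaces, as in strpos(" " + name1 + " ", " " + name2[`j'] + " "), is one way to require whole words.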

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/






--
To every ω-consistent recursive class κ of formulae there correspond
recursive class signs r, such that neither v Gen r nor Neg(v Gen r)
belongs to Flg(κ) (where v is the free variable of r).

--
Thomas Speidel

