
st: "sounds like" function in Stata

From   Dalhia <>
Subject   st: "sounds like" function in Stata
Date   Tue, 18 Aug 2009 11:55:45 -0700 (PDT)

Hi, Is there some kind of "sounds like" function in Stata? I have a list of companies, but the names sometimes differ slightly. For example, AOL Time Warner also appears as AOL, Time Warner, and Time Warner Inc. I need a method to figure out that all these are the same entity, and none of the string functions in Stata seem to do what I want. Do any of you have any suggestions? Here is what the data look like:


AOL Time Warner
Time Warner Inc
Microsoft Inc

Ideally, what I would like is some way to indicate which names are similar. For example:

Name, Similarity

AOL, 1
AOL Time Warner, 1
Time Warner Inc, 1
Microsoft, 2
Microsoft Inc, 2
Microsft, 2

Any help will be much appreciated. 
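
A minimal sketch of the kind of grouping described above, written in Python rather than Stata purely to illustrate the logic. It assumes two names belong together when any of their words match or nearly match (so "Microsft" links to "Microsoft"), and it links names transitively. The function names, the 0.85 threshold, and the list of ignored suffixes are all hypothetical choices, not anything built into Stata.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of generic suffixes to ignore; extend as needed.
STOP_TOKENS = {"inc", "corp", "co", "ltd", "llc"}


def tokens(name):
    """Lowercase a name, split on non-alphanumerics, drop generic suffixes."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return [w for w in words if w not in STOP_TOKENS]


def similar(a, b, threshold=0.85):
    """True if any word of a closely matches any word of b."""
    return any(
        SequenceMatcher(None, x, y).ratio() >= threshold
        for x in tokens(a)
        for y in tokens(b)
    )


def group_names(names, threshold=0.85):
    """Return a group number per name, linking similar names transitively."""
    parent = list(range(len(names)))

    def find(i):
        # Follow parent links to the representative of i's group.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the groups of every pair of names judged similar.
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if similar(names[i], names[j], threshold):
                parent[find(j)] = find(i)

    # Number the groups 1, 2, ... in order of first appearance.
    labels = {}
    result = []
    for i in range(len(names)):
        root = find(i)
        if root not in labels:
            labels[root] = len(labels) + 1
        result.append(labels[root])
    return result


names = ["AOL", "AOL Time Warner", "Time Warner Inc",
         "Microsoft", "Microsoft Inc", "Microsft"]
for name, group in zip(names, group_names(names)):
    print(f"{name}, {group}")
```

On the six example names this assigns group 1 to the three AOL/Time Warner variants and group 2 to the three Microsoft variants. Matching on any shared word is deliberately loose: a common word such as "Time" would also pull in unrelated companies, so the results would need checking by hand.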

--- On Fri, 6/5/09, Nick Cox <> wrote:

> From: Nick Cox <>
> Subject: st: RE: appling string functions across observations
> To:
> Date: Friday, June 5, 2009, 3:00 PM
> Check out -fndmtch2- or -fndmtch- from SSC. At first sight they don't
> address this problem, but there are at least two ways forward:
>
> It sounds as if you have surnames and full names (or the equivalent in
> your area). -split- the full names and work with the separate variables.
>
> Clone one of the programs above but modify the code to look for string
> inclusion rather than strict equality.
>
> Nick
>
> Dalhia
>
> I have a list of two variables: name1 and name2. I need to check if
> name2 occurs in any of the name1s. The regexm() function in Stata is
> perfect for what I want to do, but it checks only one string at a time,
> and I need it to somehow rotate over a whole list of names.
>
> Here is what I have:
>
> name1
> ram solanki
> goel mehta
> ashish gupta
>
> name2
> solanki
> mehta
>
> I need to be able to figure out that "solanki" and "mehta" in name2
> occur in name1 observation1 and observation2.
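
The check described in the quoted message (does each name2 occur in any name1?) can be sketched as follows, again in Python rather than Stata and only as an illustration. It assumes a match means that name2 equals one whole word of name1, in the spirit of the -split- suggestion; `find_matches` is a hypothetical helper name.

```python
def find_matches(name1, name2):
    """Map each entry of name2 to the 1-based observations of name1 containing it as a whole word."""
    matches = {}
    for short in name2:
        matches[short] = [
            i
            for i, full in enumerate(name1, start=1)
            if short.lower() in full.lower().split()
        ]
    return matches


name1 = ["ram solanki", "goel mehta", "ashish gupta"]
name2 = ["solanki", "mehta"]
print(find_matches(name1, name2))  # {'solanki': [1], 'mehta': [2]}
```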

