Re: st: identifying strings that differ on one or two letters


From   Scott Merryman <[email protected]>
To   [email protected]
Subject   Re: st: identifying strings that differ on one or two letters
Date   Fri, 19 Nov 2010 08:55:05 -0600

You might try -soundex()-

. clear

. set obs 3
obs was 0, now 3

. generate str var1 = "Jayanthi chemicals" in 1
(2 missing values generated)

. replace var1 = "Jayanth chemicals" in 2
(1 real change made)

. replace var1 = "Jay chemicals" in 3
(1 real change made)

. gen soundex = soundex(var1)

. cl

                   var1    soundex
  1. Jayanthi chemicals       J532
  2.  Jayanth chemicals       J532
  3.      Jay chemicals       J252
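
To pick out the matching names automatically, one option is to flag every observation whose code is shared with at least one other observation. The following is only a sketch: the variable names code and has_match are made up for illustration, and the data are re-entered with -input- so the block runs on its own. Keep in mind that soundex() is a phonetic code (first letter plus three digits), so it groups names that sound alike rather than counting how many letters differ.

```stata
* Re-create the three example names from the log above
clear
input str18 var1
"Jayanthi chemicals"
"Jayanth chemicals"
"Jay chemicals"
end

* Phonetic code for each name (code is a hypothetical variable name)
generate code = soundex(var1)

* Within each group of identical codes, _N is the group size;
* has_match is 1 when at least one other name shares the code.
* Note that bysort re-sorts the data by code.
bysort code: generate byte has_match = _N > 1

list var1 code has_match
```

Names that share a code are only candidates for a match, so it is worth checking them by eye before treating them as the same company.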



On Fri, Nov 19, 2010 at 6:59 AM, Dalhia <[email protected]> wrote:
> Hello,
>
> Is there a method in stata to identify strings that differ by just one or two letters?
> For example:
>
> comp_name
>
> Jayanthi chemicals
> Jayanth chemicals
> Jay chemicals
>
> So here the first two should be identified, since they differ by only one letter, but not the last one, since it differs by 4 letters. Is there a way to do this in Stata?
>
> thanks. I appreciate your help.
> dalhia
>