Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: RE: strings
Nick Cox <email@example.com>
Thu, 2 Feb 2012 09:56:07 +0000
replace company = substr(company, 1, length(company) - 4) if substr(company, -4, 4) == " INC"
is a better way to remove any trailing " INC".
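A quick way to check the command above is to run it on a few toy values. This is only a sketch: the company names below are made up for illustration, and the variable is assumed to be called company as in the thread.

```stata
* Toy data: these company names are invented for testing only.
clear
input str40 company
"ABC BLINCAR COMPANY INC"
"INCA FOODS"
"XYZ INC"
end

* Drop the last four characters only when they are exactly " INC".
replace company = substr(company, 1, length(company) - 4) if substr(company, -4, 4) == " INC"

list, clean
```

Only the first and third values should change, since "INCA FOODS" contains "INC" but does not end in " INC".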
On Wed, Feb 1, 2012 at 12:23 PM, Nick Cox <firstname.lastname@example.org> wrote:
> That's not really a -split- problem. "INC" is not a string separator
> here. I am credited as the original author of -split- so I can tell
> you that it was not designed for this.
> The easiest recipe (!) I can think of is
> gen reversed = reverse(company)
> replace reversed = subinstr(reversed, "CNI ", "", 1) if substr(reversed, 1, 4) == "CNI "
> replace company = reverse(reversed)
> That zaps " INC" if and only if it is the last four characters of your variable.
> The three commands above could be telescoped into one with some loss of clarity.
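For reference, one possible telescoped form of the three commands is sketched below. It assumes the same variable name, company, and is an illustration of the idea rather than the command the author had in mind.

```stata
* One-command version of the reverse / subinstr / reverse recipe.
* reverse(company) starts with "CNI " exactly when company ends with " INC",
* so the first occurrence removed by subinstr() is the trailing one.
replace company = reverse(subinstr(reverse(company), "CNI ", "", 1)) if substr(reverse(company), 1, 4) == "CNI "
```

This avoids the temporary variable at the cost of calling reverse() three times, which is the loss of clarity mentioned above.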
> I can believe that this may not delete all you want to delete.
> 1. I am using the -split- command to divide my string variables into parts. Is
> there any way to force the split only at the last occurrence of the separator?
> E.g. if strings are like "ABC BLINCAR COMPANY INC" and I want to remove
> the "INC" from all the strings: if I use split, p(INC) I will get "ABC
> BL" instead of "ABC BLINCAR COMPANY".
> 2. Is there any way to force Stata to ignore letter case when
> comparing strings?
> E.g. if I merge 2 files by a string variable, I want the name "ROGER"
> and the name "Roger" to be recognized as the same string.
> NJC>>> In general, you have to clean up inconsistencies before -merge-. -merge- has a difficult enough job as it is!
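One common way to do that clean-up is to build a case-normalized key in both files before merging. The sketch below assumes two hypothetical datasets, names_a.dta and names_b.dta, each holding a string variable called name; none of these names come from the thread. It also assumes the cleaned key identifies observations uniquely in each file, which -merge 1:1- requires.

```stata
* Hypothetical files and variable: names_a.dta, names_b.dta, name.
use names_b, clear
* Upper-case and strip leading/trailing blanks so "Roger" and "ROGER " match.
gen name_key = upper(trim(name))
save names_b_clean, replace

use names_a, clear
gen name_key = upper(trim(name))
* Assumes name_key is unique within each file; isid exits with an error if not.
isid name_key
merge 1:1 name_key using names_b_clean
```

Merging on a separate key keeps the original spelling in name, so the cleaned values can be inspected alongside the raw ones afterwards.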
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/