st: RE: strings

From   Nick Cox <>
To   "''" <>
Subject   st: RE: strings
Date   Wed, 1 Feb 2012 12:23:08 +0000

Answers imbedded below. In general, "forcing" Stata is not a good way to think! 



i have 2 questions:

i am using split command to divide my string variables into parts, is
there any way to force the split only by last occurrence of the split

e.g. if strings are like "ABC BLINCAR COMPANY INC" and i want remove
the "INC" from all the strings. if i use split, p(INC) i will get "ABC

NJC>>> That's not really a -split- problem. "INC" is not a string separator here. I am credited as the original author of -split- so I can tell you that it was not designed for this. 

The easiest recipe (!) I can think of is 

gen reversed = reverse(company)
replace reversed = subinstr(reversed, "CNI ", "", 1) if substr(reversed, 1, 4) == "CNI "
replace company = reverse(reversed) 

That zaps " INC" if and only if it is the last four characters of your variable. 

The three commands above could be telescoped into one with some loss of clarity. 
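For the record, a telescoped version might look like the sketch below. It assumes the same variable name, company, as in the recipe above and skips the temporary variable; it has not been run against the poster's data.

```stata
* One-command version of the reverse / subinstr / reverse recipe.
* Assumes the string variable is called company, as above.
* reverse(company) puts a trailing " INC" at the front as "CNI ",
* subinstr() removes the first occurrence only, and the outer
* reverse() restores the original order.
replace company = reverse(subinstr(reverse(company), "CNI ", "", 1)) ///
    if substr(reverse(company), 1, 4) == "CNI "
```

The -if- condition still restricts the change to values whose last four characters are " INC", so an "INC" elsewhere in the name is left alone.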

I can believe that this may not delete all you want to delete. 

2. is there any way to force stata to ignore letters case when
comparing strings?
e.g. if i merge 2 files by string variable i want that name "ROGER"
and name "Roger" would be recognized as the same string

NJC>>> In general, you have to clean up inconsistencies before -merge-. -merge- has a difficult enough job as it is! 
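One way to do that clean-up for letter case, sketched under assumptions: the file names file1 and file2, the variable name name, and the new key name_key are all made up for illustration, and the 1:1 merge assumes the upper-case key is unique in each file.

```stata
* Hypothetical names throughout: file1.dta, file2.dta, name, name_key.
* Build the same upper-case key in the second file and save a copy.
use file2, clear
gen name_key = upper(name)
save file2_keyed, replace

* Build the key in the first file, then merge on the key, not on name.
* merge 1:1 assumes name_key identifies observations uniquely in both files.
use file1, clear
gen name_key = upper(name)
merge 1:1 name_key using file2_keyed
```

With this, "ROGER" and "Roger" both map to "ROGER" in name_key and so match. Other inconsistencies, such as stray spaces, would need their own clean-up before the merge.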
