st: RE: strings

From   Nick Cox <>
To   "''" <>
Subject   st: RE: strings
Date   Wed, 1 Feb 2012 12:23:08 +0000

Answers embedded below. In general, "forcing" Stata is not a good way to think!



i have 2 questions:

i am using split command to divide my string variables into parts, is
there any way to force the split only by last occurrence of the split

e.g. if strings are like "ABC BLINCAR COMPANY INC" and i want remove
the "INC" from all the strings. if i use split, p(INC) i will get "ABC

NJC>>> That's not really a -split- problem. "INC" is not a string separator here. I am credited as the original author of -split- so I can tell you that it was not designed for this. 

The easiest recipe (!) I can think of is 

gen reversed = reverse(company)
replace reversed = subinstr(reversed, "CNI ", "", 1) if substr(reversed, 1, 4) == "CNI "
replace company = reverse(reversed) 

That zaps " INC" if and only if it is the last four characters of your variable. 

The three commands above could be telescoped into one with some loss of clarity. 
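
As a sketch only, and not necessarily the telescoping intended here, the three commands might collapse to

replace company = reverse(subinstr(reverse(company), "CNI ", "", 1)) if substr(reverse(company), 1, 4) == "CNI "

An alternative that avoids reversing altogether, assuming as before that the variable is called company, is

replace company = substr(company, 1, length(company) - 4) if substr(company, -4, 4) == " INC"

Either way, only a trailing " INC" is removed; "INC" elsewhere in the name (as in "BLINCAR") is left alone.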

I can believe that this may not delete all you want to delete. 

2. is there any way to force stata to ignore letters case when
comparing strings?
e.g. if i merge 2 files by string variable i want that name "ROGER"
and name "Roger" would be recognized as the same string

NJC>>> In general, you have to clean up inconsistencies before -merge-. -merge- has a difficult enough job as it is! 
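
One way to do that clean-up, sketched here with hypothetical file names (file1, file2, file1_key) and a hypothetical string variable called name, is to create a key that is upper case in both files and -merge- on that key. This assumes the key identifies observations uniquely in each file; if it does not, 1:1 is the wrong kind of merge.

use file1, clear
* key ignores letter case and leading/trailing spaces
gen name_key = upper(trim(name))
save file1_key, replace

use file2, clear
gen name_key = upper(trim(name))
merge 1:1 name_key using file1_key

With this key, "ROGER" and "Roger" both become "ROGER" and so match. The original name variables are untouched, so the spelling in each file can still be inspected after the merge.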
