Re: st: getting part of strings

Daniel Marcelino
To   [email protected]
Re: st: getting part of strings
Sun, 27 Mar 2011 12:57:55 -0300

I get it. However, this thread led me back to an old issue on my mind: how
to strip language marks (accents) from strings, replacing each accented
letter with its plain counterpart, like "Ô" with "O" or "È" with "E".
So maybe I can store a local table of corresponding letters and run it in
a loop over each observation of the string var. What do you think about it?

input str200 var1
"1212 - DAMIÃO FELICIANO DA SILVA - PB - Deputado Federal"
end

// table accent
local accent = {
  ['á'] = 'a',
  ['à'] = 'a',
  ['ã'] = 'a',
  ['é'] = 'e',
  ['è'] = 'e',
  ['É'] = 'E',
  ['Ó'] = 'O',
  ['í'] = 'i',
  ['Í'] = 'I',
  ['ü'] = 'u',
  ['Ü'] = 'U'
}
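
The table-and-loop idea above could be sketched in Stata roughly as follows. This is only an illustration, not a tested recipe: the local names accented and plain and the variable name clean are made up for the example, and "Ã" is added to the list as an assumption so that the sample line is actually affected. It also assumes the do-file and the data use the same character encoding, since subinstr() matches the stored bytes.

```stata
clear
input str200 var1
"1212 - DAMIÃO FELICIANO DA SILVA - PB - Deputado Federal"
end

* parallel lists: the j-th word of accented maps to the j-th word of plain
* (hypothetical names; "Ã" is an addition to the original table)
local accented "á à ã Ã é è É Ó í Í ü Ü"
local plain    "a a a A e e E O i I u U"

* work on a copy so var1 stays untouched
clonevar clean = var1

* walk the two lists in step and swap each accented letter for its plain one
local n : word count `accented'
forvalues j = 1/`n' {
    local from : word `j' of `accented'
    local to   : word `j' of `plain'
    quietly replace clean = subinstr(clean, "`from'", "`to'", .)
}

list clean
```

Keeping the mapping in two space-separated locals avoids hard-coding one -replace- per letter; adding a new pair means appending one word to each list in the same position.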

On Sun, Mar 27, 2011 at 1:17 AM, Eric Booth wrote:
> <>
> On Mar 26, 2011, at 10:10 PM, Rebecca Pope wrote:
>> Daniel,
>> You could try using char(). The ASCII equivalent to "A" is 65; for "Z"
>> it is 90. Maybe something like this would work for you (piggy-backing
>> on Nick's earlier suggestion):
>> clonevar copy = var1
>> replace copy = upper(copy)
>> qui forval i = 65/90 {
>>     local letter = char(`i')
>>     replace copy = subinstr(copy, "`letter'", "", .)
>> }
> Another option is to use c(alpha) and c(ALPHA) for standard alpha characters
> ********modifying NJC's example:
> clonevar copy = var1
> qui foreach i in `c(alpha)' `c(ALPHA)'  {
>           replace copy = subinstr(copy, "`i'", "", .)
> }
> *******
>> This won't work for all of your text (e.g. Ã). I don't know of any way
>> to look the numeric values up in Stata, so I'll plug a previous post
>> by Nick
>> ( and
>> advise you to look up the ASCII codes for any accented letters by
>> searching the internet for "ANSI character code chart". You'll need to
>> modify the code above to add any additional numbers you need & switch
>> to -foreach- with -numlist-.
> Take a look at -ascii- and -asciiplot- from SSC.
> Also, you can get a list of all the chars used in var1 with -charlist- from SSC.
> - Eric
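
On looking up character codes from within Stata, as discussed in the quoted messages: a minimal sketch, assuming Mata is available (Stata 9 or later). char() turns a code into a character, and Mata's ascii() goes the other way. For accented letters the codes returned depend on the encoding in which the string is stored, so treat those values as encoding-specific.

```stata
* char() maps a numeric code to its character; A-Z occupy codes 65-90
display char(65)
display char(90)

* Mata's ascii() returns the code(s) of the characters in a string
mata: ascii("A")
mata: ascii("AZ")
```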
