Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: getting part of strings

From	Daniel Marcelino <[email protected]>
To	[email protected]
Subject	Re: st: getting part of strings
Date	Sun, 27 Mar 2011 12:57:55 -0300

I get it. However, this thread led me to an old issue on my mind: how
to strip language marks (accents) from strings, replacing each accented
letter with its plain counterpart, such as "Ô" with "O" or "È" with "E".
So maybe I can store a local table of corresponding letters and run it
in a loop over the string variable. What do you think about it?

/****/
clear
inp str200 var1
"45123 - ANTÔNIO HERVÁZIO BEZERRA CAVALCANTI - PB - Deputado Estadual"
"1212 - DAMIÃO FELICIANO DA SILVA - PB - Deputado Federal"
end

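// table accent: accented letters (from) and their plain replacements (to),
// matched by position; add further pairs (e.g. Ô and O) as needed
local from "á à ã é è É Ó í Í ü Ü"
local to   "a a a e e E O i I u U"

// loop over the table, replacing every occurrence in the string variable
local n : word count `from'
forvalues j = 1/`n' {
    local a : word `j' of `from'
    local b : word `j' of `to'
    qui replace var1 = subinstr(var1, "`a'", "`b'", .)
}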
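
A different route from the lookup table, offered only as a sketch: it assumes Stata 14 or later (where the Unicode string functions ustrnormalize() and ustrto() exist) and that the variable holds UTF-8 text, neither of which applies to the Stata version in use when this thread was written. The variable name plain is made up for illustration.

```stata
* Sketch only: assumes Stata 14+ and UTF-8 strings.
clear
input str80 var1
"1212 - DAMIÃO FELICIANO DA SILVA - PB - Deputado Federal"
end

* NFD normalization splits each accented letter into a base letter
* followed by a combining mark; ustrto(..., "ascii", 2) then skips
* every character that plain ASCII cannot represent.
generate plain = ustrto(ustrnormalize(var1, "nfd"), "ascii", 2)
list plain
```

This needs no table to maintain, but letters that have no base-letter decomposition (for example "ß" or "ø") would be dropped rather than replaced, so an explicit table may still be the safer choice for such data.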



On Sun, Mar 27, 2011 at 1:17 AM, Eric Booth <[email protected]> wrote:
> <>
>
> On Mar 26, 2011, at 10:10 PM, Rebecca Pope wrote:
>
>> Daniel,
>> You could try using char(). The ASCII equivalent to "A" is 65; for "Z"
>> it is 90. Maybe something like this would work for you (piggy-backing
>> on Nick's earlier suggestion):
>>
>> clonevar copy = var1
>> replace copy = upper(copy)
>> qui forval i = 65/90 {
>>     local letter = char(`i')
>>     replace copy = subinstr(copy, "`letter'", "", .)
>> }
>
> Another option is to use c(alpha) and c(ALPHA) for standard alpha characters
> ********modifying NJC's example:
> clonevar copy = var1
> qui foreach i in `c(alpha)' `c(ALPHA)'  {
>           replace copy = subinstr(copy, "`i'", "", .)
> }
> *******
>
>>
>> This won't work for all of your text (e.g. Ã). I don't know of any way
>> to look the numeric values up in Stata, so I'll plug a previous post
>> by Nick
>> (http://www.stata.com/statalist/archive/2006-12/msg00446.html) and
>> advise you to look up the ASCII codes for any accented letters by
>> searching the internet for "ANSI character code chart". You'll need to
>> modify the code above to add any additional numbers you need & switch
>> to -foreach- with -numlist-.
>
> Take a look at -ascii- and -asciiplot- from SSC.
> Also, you can get a list of all the chars used in var1 with -charlist- from SSC.
>
>
> - Eric
>
> __
> Eric A. Booth
> Public Policy Research Institute
> Texas A&M University
> [email protected]
>
>
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>

