Re: st: RE: Remove prefixes (e.g., >, <, and +/-) from numbers stored as strings


From   Richard Herron <richard.c.herron@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: Remove prefixes (e.g., >, <, and +/-) from numbers stored as strings
Date   Fri, 8 Jun 2012 14:12:10 -0400

Thanks, David! That's big. I hadn't noticed the -ignore()- option in -destring-.

But what if I don't know the set of possible prefixes? I guess
-destring- will throw an error and I can iteratively improve my filter?
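
One way to iterate on the filter, sketched here under assumptions: add -force- so that entries -destring- still cannot convert become missing, then list them to see which characters remain. The variable names and sample values below are made up for illustration.

```stata
* Hypothetical sample data: "~12" carries a prefix the filter does not know
clear
input str12 myvariable
">45.5"
"+/-30"
"~12"
"88"
end

* -force- turns entries that still contain nonnumeric characters into missing
destring myvariable, ignore("+/-<>") generate(myvariable2) force

* The leftovers show which characters to add to -ignore()- next
list myvariable if missing(myvariable2) & !missing(myvariable)
```

Note that -ignore()- strips the listed characters wherever they occur, so a genuine minus sign would be dropped along with the "-" in "+/-".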

I have some entries where +/- appears as a single character, like the
LaTeX \pm symbol, with the + stacked on the -. I think this is Unicode U+00B1.
http://www.fileformat.info/info/unicode/char/b1/index.htm

Can I use -destring- to -ignore()- these?
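
A possible alternative that avoids naming the character at all, offered as a sketch only: it assumes Stata 14 or later, whose Unicode string functions (-uchar()-, -ustrregexra()-) postdate this thread. It deletes everything that is not a digit or a decimal point, so the sign is lost too. The sample value is hypothetical.

```stata
* Assumes Stata 14 or later (Unicode-aware string functions)
clear
input str12 combo
"30"
end

* Prepend U+00B1 (decimal 177, the plus-minus sign) to build a test value
replace combo = uchar(177) + "30" in 1

* Delete every character that is not a digit or a decimal point
generate double number2 = real(ustrregexra(combo, "[^0-9.]", ""))
list
```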

Thanks!

Richard Herron


On Fri, Jun 8, 2012 at 1:59 PM, David Radwin <dradwin@mprinc.com> wrote:
> Can you use -destring- with the -ignore- option like this?
>
> . destring myvariable, ignore("+/-<>") generate(myvariable2)
>
> David
> --
> David Radwin
> Senior Research Associate
> MPR Associates, Inc.
> 2150 Shattuck Ave., Suite 800
> Berkeley, CA 94704
> Phone: 510-849-4942
> Fax: 510-849-0794
>
> www.mprinc.com
>
>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-
>> statalist@hsphsun2.harvard.edu] On Behalf Of Richard Herron
>> Sent: Friday, June 08, 2012 10:30 AM
>> To: statalist@hsphsun2.harvard.edu
>> Subject: st: Remove prefixes (e.g., >, <, and +/-) from numbers stored as
>> strings
>>
>> I have numbers stored as strings with prefixes (e.g., "+/-30") that I
>> would like to convert to numbers. Not all entries necessarily have
>> prefixes (or postfixes).
>>
>> With -regexm()- and -regexs()- I can remove postfixes and handle
>> decimals, but I can't remove prefixes. Can you spot my error with
>> -regexm()-? Thanks!
>>
>> Richard Herron
>>
>> * begin code
>> clear
>> set obs 20
>> generate number = 100*runiform()
>> generate prefix = ""
>> generate postfix = ""
>> foreach i of numlist 1 5 10 15 {
>>     replace prefix = ">" in `i'
>>     replace postfix = "%" in `=`i' + 1'
>>     replace number = int(number) in `=`i' + 2'
>> }
>> egen combo = concat(prefix number postfix)
>> generate number2 = regexs(1) if regexm(combo, "([0-9]*\.?[0-9]*)")
>> list
>> * end code
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
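
On the -regexm()- question in the original post: every piece of "([0-9]*\.?[0-9]*)" is optional, so the pattern can match the empty string at the very start of a value that begins with a prefix, and -regexs(1)- then returns "". Requiring at least one digit with "+" lets the match start at the first digit instead. The following is a minimal sketch with made-up data; the variable names old and new are for illustration, and it assumes each number begins with a digit and that signs can be discarded.

```stata
* Hypothetical sample values with and without prefixes/postfixes
clear
input str12 combo
">45.5"
"12%"
"88"
end

* Original pattern: all parts optional, so it matches "" before the ">"
generate str12 old = regexs(1) if regexm(combo, "([0-9]*\.?[0-9]*)")

* Revised pattern: [0-9]+ needs at least one digit, so the match is nonempty
generate str12 new = regexs(1) if regexm(combo, "([0-9]+\.?[0-9]*)")

* Convert the extracted text to a number
generate double number2 = real(new)
list
```

Because the revised pattern only looks for the first run of digits, it does not need to know the set of prefixes in advance.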
