RE: st: RE: Remove prefixes (e.g., >, <, and +/-) from numbers stored as strings


From   "David Radwin" <dradwin@mprinc.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: Remove prefixes (e.g., >, <, and +/-) from numbers stored as strings
Date   Fri, 8 Jun 2012 11:32:41 -0700 (PDT)

Richard,

On the first question, I imagine there is a way to -ignore- all non-numeric
characters, perhaps by looping over the ASCII codes (see below) to build a
macro of every non-numeric character, but I doubt it would be worth the
effort. Your intuition is probably the better route.
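
A rough sketch of one way to drop every non-numeric character without listing
them in advance. It swaps the -ignore()- macro idea for a different technique:
scan each string position and keep only digits and decimal points. The
variable names (myvariable, clean, myvariable2) and the sample values are
hypothetical, and the sketch assumes no value is negative, because minus signs
are discarded along with everything else.

```stata
clear
input str10 myvariable
">30"
"45.5%"
"+/-12"
"7"
end

* Longest string length, so the loop knows how many positions to scan
generate len = length(myvariable)
summarize len, meanonly
local maxlen = r(max)

* Copy across only digits and decimal points, one position at a time
generate str10 clean = ""
forvalues j = 1/`maxlen' {
    replace clean = clean + substr(myvariable, `j', 1) ///
        if inrange(substr(myvariable, `j', 1), "0", "9") ///
        | substr(myvariable, `j', 1) == "."
}

* clean now holds only digits and periods, so no ignore() list is needed
destring clean, generate(myvariable2)
list myvariable clean myvariable2
```

The /// line continuations work in a do-file but not when typed interactively.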

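On the second question, you can give -destring- the character to ignore by
its extended ASCII code, which in this case is 177. Note that -destring-
also needs either -generate()- or -replace-.

. destring myvariable, generate(myvariable2) ignore("`=char(177)'")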

I use -asciiplot- by Michael Blasnik, Svend Juul, and Nicholas J. Cox,
available from SSC, as a convenient way to identify the ASCII codes of such
characters.
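
A small self-contained sketch of ignoring the plus-minus character, in case it
helps to try it on toy data first. The variable names and sample values are
made up. One assumption to flag: the advice here uses char(177), which yields
a single extended ASCII byte; in Stata 14 and later, strings are Unicode, so
the sketch switches to uchar(177) there.

```stata
clear
set obs 3
generate str10 myvariable = "12.5"

* uchar() exists only in Stata 14 or later; older releases use char()
if c(stata_version) >= 14 {
    local pm = uchar(177)
}
else {
    local pm = char(177)
}

* Hypothetical sample values: one with the plus-minus prefix, one with ">"
replace myvariable = "`pm'30" in 2
replace myvariable = ">7" in 3

* Ignore both prefixes and write the numeric result to a new variable
destring myvariable, generate(myvariable2) ignore("`pm'>")
list
```

Writing to a new variable with -generate()- keeps the original strings
available for checking what was stripped.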

David
--
David Radwin
Senior Research Associate
MPR Associates, Inc.
2150 Shattuck Ave., Suite 800
Berkeley, CA 94704
Phone: 510-849-4942
Fax: 510-849-0794

www.mprinc.com


> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-
> statalist@hsphsun2.harvard.edu] On Behalf Of Richard Herron
> Sent: Friday, June 08, 2012 11:12 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: RE: Remove prefixes (e.g., >, <, and +/-) from numbers
> stored as strings
>
> Thanks, David! That's big. I hadn't noticed the -ignore()- option in
> -destring-.
>
> But what if I don't know the set of possible prefixes? I guess
> -destring- will throw an error and I iteratively improve my filter?
>
> I have some entries where +/- is a single symbol, like the LaTeX \pm
> symbol, with the + stacked on the -. I think this is Unicode U+00B1.
> http://www.fileformat.info/info/unicode/char/b1/index.htm
>
> Can I use -destring- to -ignore()- these?
>
> Thanks!
>
> Richard Herron
>
>
> On Fri, Jun 8, 2012 at 1:59 PM, David Radwin <dradwin@mprinc.com> wrote:
> > Can you use -destring- with the -ignore- option like this?
> >
> > . destring myvariable, ignore("+/-<>") generate(myvariable2)
> >
> > David
> >
> >> -----Original Message-----
> >> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-
> >> statalist@hsphsun2.harvard.edu] On Behalf Of Richard Herron
> >> Sent: Friday, June 08, 2012 10:30 AM
> >> To: statalist@hsphsun2.harvard.edu
> >> Subject: st: Remove prefixes (e.g., >, <, and +/-) from numbers stored
> >> as strings
> >>
> >> I have numbers stored as strings with prefixes (e.g., "+/-30") that I
> >> would like to convert to numbers. Not all entries necessarily have
> >> prefixes (or postfixes).
> >>
> >> With -regexm()- and -regexs()- I can remove postfixes and handle
> >> decimals, but I can't remove prefixes. Can you spot my error with
> >> -regexm()-? Thanks!
> >>
> >> Richard Herron
> >>
> >> * begin code
> >> clear
> >> set obs 20
> >> generate number = 100*runiform()
> >> generate prefix = ""
> >> generate postfix = ""
> >> foreach i of numlist 1 5 10 15 {
> >>     replace prefix = ">" in `i'
> >>     replace postfix = "%" in `=`i' + 1'
> >>     replace number = int(number) in `=`i' + 2'
> >> }
> >> egen combo = concat(prefix number postfix)
> >> generate number2 = regexs(1) if regexm(combo, "([0-9]*\.?[0-9]*)")
> >> list
> >> * end code
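
On the -regexm()- question in the original post: every part of the pattern
"([0-9]*\.?[0-9]*)" is optional, so it appears to match an empty string at the
very start of any value that begins with a prefix, and -regexs(1)- then
returns "". A minimal sketch of one possible fix is to require at least one
digit, which forces the match to begin where the number does. The sample
values are hypothetical, and -real()- is added here only to get a numeric
result.

```stata
clear
input str10 combo
">30"
"45.5%"
"+/-12"
"7"
".25"
end

* [0-9]+ at the end requires at least one digit, so the match cannot be empty
generate number2 = real(regexs(1)) if regexm(combo, "([0-9]*\.?[0-9]+)")
list
```

This keeps only the first run of digits in each value, so a string containing
two separate numbers would need a different approach.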

