
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: RE: identifying letters in a string variable
Date: Thu, 1 Sep 2005 00:35:09 +0100

I didn't read the question carefully enough.
What is asked for is something like

    egen N = sieve(strvar), char(0123456789)
    egen S = sieve(strvar), keep(a)
    gen catvar = 2
    replace catvar = 1 if N == strvar
    replace catvar = 3 if S == strvar

Nick
n.j.cox@durham.ac.uk

Nick Cox

> A loose test of whether a string variable has only
> numbers is that -real(strvar)- is not missing,
> always remembering the possibility of missing
> values.
>
>     real("")
>
> and
>
>     real(".")
>
> both return numeric missing.
>
> In the -egenmore- package on SSC there is
> a function -sieve()- which may help here.
>
> sieve(strvar) , { keep(classes) | char(chars) | omit(chars) }
> selects characters from strvar according to a specified criterion
> and generates a new string variable containing only those characters.
> This may be done in three ways. First, characters are classified using
> the keywords alphabetic (any of a-z or A-Z), numeric (any of 0-9),
> space or other. keep() specifies one or more of those classes:
> keywords may be abbreviated by as little as one letter. Thus
> keep(a n) selects alphabetic and numeric characters and omits spaces
> and other characters. Note that keywords must be separated by spaces.
> Alternatively, char() specifies each character to be selected or
> omit() specifies each character to be omitted. Thus
> char(0123456789.) selects numeric characters and the stop (presumably
> as decimal point); omit(" ") strips spaces and omit(`"""') strips
> double quotes. (Stata 7 required.)
>
> So you could look at a string variable like this.
>
>     egen N = sieve(strvar), keep(n)
>     capture assert N == strvar
>     if _rc {
>         // characters present
>         egen S = sieve(strvar), keep(a)
>         capture assert S == strvar
>         if _rc {
>             // must be a mixture
>             <code for this case>
>         }
>         else {
>             // must be all string
>             <code for this case>
>         }
>     }
>     else {
>         // must be all numeric
>         <code for this case>
>     }
>     drop N
>     capture drop S    // S exists only if the first branch ran
>
> Nick
> n.j.cox@durham.ac.uk
>
> TEWODAJ MOGUES
>
> > I looked through the string functions to try to find out which
> > values of a string variable have letters plus numbers, only
> > letters, and only numbers, but didn't come up with anything. E.g.
> > suppose I wanted to create a categorical variable that takes on 1
> > when stringvar has only numbers, 2 if a mix of numbers and letters,
> > and 3 if only letters:
> >
> > stringvar    catvar
> > 1            1
> > 12           1
> > id14         2
> > run          3
> > 5K           2
> > SPRINT       3

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
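
For readers without -egenmore- installed, the same three-way classification can probably be sketched with Stata's built-in regexm() instead of sieve(). This is an alternative under stated assumptions, not the method posted in the thread: it assumes a Stata version that provides regexm(), treats only 0-9 as numeric and only a-z and A-Z as alphabetic, and leaves catvar missing for empty strings. The example data are the six values from the original question.

```stata
* Sketch only: classify strvar with regexm() rather than egen's sieve().
* The names strvar and catvar follow the thread; the regular-expression
* approach itself is an assumption, not the poster's code.
clear
input str10 strvar
"1"
"12"
"id14"
"run"
"5K"
"SPRINT"
end

* Start from "mixture" for every nonempty value ...
generate byte catvar = 2 if strvar != ""
* ... then overwrite the two pure cases.
replace catvar = 1 if regexm(strvar, "^[0-9]+$")
replace catvar = 3 if regexm(strvar, "^[a-zA-Z]+$")

list strvar catvar, noobs
```

As in the sieve() version, anything that is neither all digits nor all letters ends up in category 2, so values containing spaces or punctuation are counted as mixtures as well. If that is too loose, an extra condition on the "mixture" line would be needed.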

