From: "Martin Weiss" <martin.weiss1@gmx.de>

To: <statalist@hsphsun2.harvard.edu>

Subject: st: RE: RE: RE: FW: Using regex to identify strings with capital letters

Date: Wed, 26 May 2010 22:28:26 +0200

So the regular expressions may be back in business, as earlier suggested by Erik:

***********
di regexm(substr("erik in lower case",1,2) , "^([A-Z][A-Z])")
di regexm(substr("Erik in lower case",1,2) , "^([A-Z][A-Z])")
di regexm(substr("ERik in lower case",1,2) , "^([A-Z][A-Z])")
***********

HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Martin Weiss
Sent: Wednesday, 26 May 2010 22:18
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: RE: FW: Using regex to identify strings with capital letters

Erik does have a point, though, in that Nick's -inrange()- proposal seems to check the first character only:

***********
di inrange(substr("erik in lower case",1,2) , "AA", "ZZ")
di inrange(substr("Erik in lower case",1,2) , "AA", "ZZ")
***********

BTW, why was -di inrange("erik in lower case", "AA", "ZZ")- a good example earlier, even though the -substr()- part was missing?

HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: Wednesday, 26 May 2010 20:42
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: FW: Using regex to identify strings with capital letters

Not true of my Stata:

. di inrange("erik in lower case", "AA", "ZZ")
0

I think -- you've heard this before -- we need to see your code and some of your results, not your speculation about what might be happening.

Nick
n.j.cox@durham.ac.uk

Beecroft, Erik (VDSS)

I tried Nick's suggestion, pasted below, but inrange does not seem to distinguish between lower and upper case. In other words, the statement below keeps all observations that begin with two letters, whether capital or lower case.

Nick Cox

You don't need regex for this.

... if inrange(substr(myvar,1,2), "AA", "ZZ")

should be enough, or even "AK" to "WY" or whatever it is. (Remember this is an international list!)

From: Beecroft, Erik (VDSS)

I need to extract certain observations from a series of text files. Each file contains only one variable, which is a string. The observations I want all begin with two capital letters. (They are state abbreviations, such as VA or AK.) The other observations do not begin with two capital letters. Is there a way to tell Stata to keep only observations for which the variable begins with two capital letters? It seems like the regex function might work, but I have never worked with regular expression syntax before.

For example, a portion of a text file might look like:

text1
text2
VA department of Social Services
text4
text5

I want to keep only the third observation above. I am using Stata 10.1 for Windows.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
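
The difference between the two approaches discussed in this thread can be seen side by side. What follows is only a minimal sketch, not code posted by any of the participants: the toy data extend the example from the original question with one mixed-case row, and the indicator names by_inrange and by_regex are made up for illustration. It assumes the string variable is called myvar, as in Nick's snippet.

```stata
* Sketch only: toy data based on the example in the original question,
* plus one hypothetical mixed-case row ("Erik ...") added for illustration
clear
input str40 myvar
"text1"
"text2"
"VA department of Social Services"
"Erik in mixed case"
"text5"
end

* inrange() compares strings in sort order, so "Er" lies between "AA" and "ZZ"
generate byte by_inrange = inrange(substr(myvar, 1, 2), "AA", "ZZ")

* the regex requires an uppercase letter in each of the first two positions
generate byte by_regex = regexm(myvar, "^[A-Z][A-Z]")

list myvar by_inrange by_regex

* keep only observations that start with two capital letters
keep if by_regex
```

With these toy data, -inrange()- flags both the "VA ..." and the "Erik ..." rows, while the regular expression flags only the "VA ..." row, so the final -keep- leaves a single observation.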

**References**:

- **st: FW: Using regex to identify strings with capital letters**, *From:* "Beecroft, Erik (VDSS)" <erik.beecroft@dss.virginia.gov>
- **st: RE: FW: Using regex to identify strings with capital letters**, *From:* "Nick Cox" <n.j.cox@durham.ac.uk>
- **st: RE: RE: FW: Using regex to identify strings with capital letters**, *From:* "Martin Weiss" <martin.weiss1@gmx.de>
