Re: st: problem with regexm leading to "regexp: unterminated ()" error for all observations

From   Phil Schumm
Subject   Re: st: problem with regexm leading to "regexp: unterminated ()" error for all observations
Date   Fri, 3 Jun 2011 12:10:53 -0500

On Jun 3, 2011, at 7:35 AM, Jamie Fagg wrote:
> I've a problem with the function -regexm-. I get the following message:
>
> regexp: unterminated ()
>
> #delimit ;
>
> //regular expression to define whether postcode is syntactically correct
>
> ge postcodevalid = 1 if regexm(postcode,"(GIR 0AA)|(((A[BL]| B[ABDHLNRSTX]
> [0-9])|EC[1-9][0-9]) [0-9][ABD-HJLNP-UW-Z]{2})")==1;

I'm not sure why Stata chokes on this, though I suspect it has something to do with the length of the expression. As Nick and Steven have already noted, the repeat qualifier {n} is not supported by Stata's regular expression syntax, so you'll need to replace

    [ABD-HJLNP-UW-Z]{2}

with the equivalent

    [ABD-HJLNP-UW-Z][ABD-HJLNP-UW-Z]

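The workaround for the repeat qualifier can be checked on a single literal before running it against the data. The following is only a minimal sketch: the sample string "AB1 0AA" and the trailing $ anchor are assumptions made for illustration, not part of the original question.

```stata
* -regexm()- does not understand {2}, so the character class is written out twice.
* The sample postcode and the "$" anchor are illustrative assumptions.
* The result is 1 when the inward part of the postcode matches.
display regexm("AB1 0AA", " [0-9][ABD-HJLNP-UW-Z][ABD-HJLNP-UW-Z]$")
```
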
Now, Nick suggested breaking the expression up, so let's do that. Your expression is equal to

    (`p1')|(((`p2a1a'|`p2a1b'|`p2a1c')`p2a1d'|`p2a2'|`p2a3'|`p2a4')`p2b')

where the individual parts (as assigned to Stata macros) are

    loc p1    "GIR 0AA"
    loc p2a1d "[1-9]?[0-9]"
    loc p2a2  "((E|N|NW|SE|SW|W)1|EC[1-4]|WC[12])[A-HJKMNPR-Y]"
    loc p2a3  "(SW|W)([2-9]|[1-9][0-9])"
    loc p2a4  "EC[1-9][0-9]"
    loc p2b   " [0-9][ABD-HJLNP-UW-Z][ABD-HJLNP-UW-Z]"

This may then be easily broken up as follows:

    gen byte valid = regexm(postcode,"`p1'")
    replace valid = 1 if regexm(postcode,"`p2a1a'`p2a1d'`p2b'")
    replace valid = 1 if regexm(postcode,"`p2a1b'`p2a1d'`p2b'")
    replace valid = 1 if regexm(postcode,"`p2a1c'`p2a1d'`p2b'")
    replace valid = 1 if regexm(postcode,"(`p2a2'|`p2a3'|`p2a4')`p2b'")
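
One point worth keeping in mind with this approach is that -regexm()- reports a match if the pattern occurs anywhere in the string, so a valid postcode embedded in a longer string would also be flagged. A minimal sketch of one way to guard against that, assuming you want the whole string to match, is to anchor each pattern with ^ and $. The sample postcodes and the anchors below are assumptions added for illustration, and only the macros listed above are used, so the p2a1 branches are left out:

```stata
* Illustrative sketch: the sample postcodes and the ^...$ anchors are
* assumptions, not part of the original commands.
clear
input str8 postcode
"GIR 0AA"
"EC1A 1BB"
"SW1A 2AA"
"EC50 9ZZ"
"EC1A 1CC"
end

loc p1   "GIR 0AA"
loc p2a2 "((E|N|NW|SE|SW|W)1|EC[1-4]|WC[12])[A-HJKMNPR-Y]"
loc p2a3 "(SW|W)([2-9]|[1-9][0-9])"
loc p2a4 "EC[1-9][0-9]"
loc p2b  " [0-9][ABD-HJLNP-UW-Z][ABD-HJLNP-UW-Z]"

* Anchor each pattern so the whole string has to match, not just a substring
gen byte valid = regexm(postcode, "^`p1'$")
replace valid = 1 if regexm(postcode, "^(`p2a2'|`p2a3'|`p2a4')`p2b'$")

* The last sample should stay at 0: "C" is excluded from the final class
list postcode valid, noobs
```

Dropping the anchors gives back the substring behaviour of the commands above.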

-- Phil
