Statalist The Stata Listserver



st: regexm and boundary symbol \b


From   Jacob Wegelin <[email protected]>
To   [email protected]
Subject   st: regexm and boundary symbol \b
Date   Thu, 21 Dec 2006 18:16:07 -0800 (PST)

In many regular expression engines the symbol \b denotes a word
boundary.  For instance, on Unix, the following use of '\b' lets us
select only those lines in a file that contain the letter 's' standing
alone, that is, not adjacent to any other letter, digit, or underscore.

UNIX> cat z
dogs and cats
sss
s, he said
george's crown
UNIX> egrep 's' z
dogs and cats
sss
s, he said
george's crown
UNIX> egrep '\bs\b' z
s, he said
george's crown
UNIX>
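
For comparison, here is a minimal sketch of the same selection written without \b, using only POSIX bracket expressions and anchors. It assumes a POSIX-style `grep -E`; the temporary file and the variable names `z`, `pattern`, and `matches` are illustrative and not part of the original session. The idea is to spell out what a word boundary means: on each side of the 's' there is either the edge of the line or a character that is not a letter, digit, or underscore.

```shell
# Recreate the four-line sample file from the transcript above
# (a temporary file; the name is arbitrary).
z=$(mktemp)
printf '%s\n' 'dogs and cats' 'sss' 's, he said' "george's crown" > "$z"

# An 's' stands alone when each side is either the start/end of the line
# or a character that is not a letter, digit, or underscore.
pattern='(^|[^[:alnum:]_])s([^[:alnum:]_]|$)'

# Select the matching lines and print them.
matches=$(grep -E "$pattern" "$z")
printf '%s\n' "$matches"

rm -f "$z"
```

Unlike \b, this pattern consumes the neighboring character, which makes no difference when the goal is only to select lines. A pattern of this shape may be a useful fallback in an engine that lacks \b, provided the engine supports alternation, anchors, and negated character classes.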

Is there a way to do this in Stata? The following attempts with \b did not work (neither lists any observations):

. list

     +-----------------------+
     | var1             var2 |
     |-----------------------|
  1. |    1    dogs and cats |
  2. |    2              sss |
  3. |    3       s, he said |
  4. |    4   george's crown |
     +-----------------------+

. list if regexm(var2, "s")

     +-----------------------+
     | var1             var2 |
     |-----------------------|
  1. |    1    dogs and cats |
  2. |    2              sss |
  3. |    3       s, he said |
  4. |    4   george's crown |
     +-----------------------+

. list if regexm(var2, "\bs\b")

. list if regexm(var2, "\\bs\\b")

Thanks for any info.

Jake Wegelin