Re: st: Seeking help in stata

From	"Anders Alexandersson" <[email protected]>
To	[email protected]
Subject	Re: st: Seeking help in stata
Date	Fri, 14 Dec 2007 10:55:35 -0500

Ah, thanks Nick. In my haste I forgot that sum() gives a running sum
across observations. I meant the sum of the two indicators within each
observation:

gen found = (regexm(codeks, "A") + regexm(codeks, "X")) >= 1
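
For the record, here is a small sketch of the difference between a running sum and a per-observation sum. The values of codeks are made up for illustration (they are not Thaddee's data), and the variable names cumflag and rowflag are hypothetical.

```stata
clear
input str5 codeks
"101"
"01A"
"102"
"0AX"
end

* running sum down the observations: stays >= 1 after the first match
gen byte cumflag = sum(regexm(codeks, "A") + regexm(codeks, "X")) >= 1

* sum of the two indicators within each observation
gen byte rowflag = (regexm(codeks, "A") + regexm(codeks, "X")) >= 1

list codeks cumflag rowflag
```

With these four values cumflag is 0, 1, 1, 1, whereas rowflag is 0, 1, 0, 1. Only the second flags each observation on its own contents.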

Having now read the FAQ on regular expressions at
http://www.stata.com/support/faqs/data/regex.html
I see that regexm() uses the pipe character for logical or, so I
also suggest this solution:

gen found = regexm(codeks, "A|X")
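
A minimal sketch of how this might be checked, again on made-up values of codeks. The character class "[AX]" and the -strpos()- version are alternatives I assume to be equivalent for this purpose; found2 and found3 are hypothetical names used only for the comparison.

```stata
clear
input str5 codeks
"101"
"102"
"01A"
"01X"
"0AX"
"103"
"11111"
end

* 1 if codeks contains A or X (or both), 0 otherwise
gen byte found = regexm(codeks, "A|X")

* the same test written with a character class
gen byte found2 = regexm(codeks, "[AX]")

* the same test without regular expressions
gen byte found3 = (strpos(codeks, "A") > 0) | (strpos(codeks, "X") > 0)

* stop with an error if the three versions ever disagree
assert found == found2 & found == found3

list codeks found

* eliminate the flagged observations
drop if found
```

The -assert- line is there only as a check on the sketch. In practice one of the three -gen- lines is enough.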

Anders Alexandersson
[email protected]

Nick Cox <[email protected]> wrote:
> Anders' solution makes use of -sum()-. That would cumulate
> from observation to observation. It sounds to me as if
> Thaddee wants to look at each observation separately.
>
> See also my solution suggested earlier.
>
> (Stata had an -index()- function, but from Stata 10 it is available
> only under version control. -strpos()- is now the equivalent.)
>

> Thaddee Badibanga <[email protected]> wrote:
>
> > I'd like to create an index from a
> > variable which is pseudo-numeric, that is, a string (numeric
> > as well as character). This index will allow me to
> > eliminate some observations in the dataset. To give
> > you an idea, the variable, which I named codeks, is as
> > follows:
> > codeks:101 102 01A 01X 0AX ...103 ... 111 112 ...11111
> >
> > I'd like to create an index that assigns 1 if codeks
> > includes A or X or AX and 0 otherwise. I have done
> > this in other programs. In one program for instance,
> > this can be done as:
> > found=indexc(codeks,"A","X")
> >
> > I will really appreciate your help. I have spent more
> > than 3 hours without success.
>
> I am not aware of a similar function in Stata. But the regexm() string
> function combined with a Boolean expression should work. This FAQ
> explains Boolean expressions in Stata:
> http://www.stata.com/support/faqs/data/trueorfalse.html
> For example, regexm(codeks, "A") would evaluate to 1 if codeks has the
> string A, and to 0 otherwise.
>
> I have not tried the following, but I think it will work as you
> intended:
> gen found = sum( regexm(codeks, "A") + regexm(codeks, "X") ) >= 1