Anders' solution makes use of -sum()-. In Stata that is a running
sum, so it would cumulate from observation to observation. It sounds
to me as if Thaddee wants to look at each observation separately.
See also my solution suggested earlier.
(Stata had an -index()- function, but from Stata 10 it is available
only under version control. -strpos()- is now the equivalent.)
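A minimal sketch of the difference, assuming a toy dataset built from the
codeks values quoted below (the data and the variable names found_cum,
found and found2 are made up for illustration). The first -generate-
reproduces the -sum()- approach; the other two test each observation on
its own, once with -strpos()- and once with a single regular expression.

```stata
* Hypothetical toy data mirroring the codeks values quoted below
clear
input str5 codeks
"101"
"01A"
"01X"
"0AX"
"11111"
end

* Running-sum version: once any earlier observation has matched,
* every later observation is flagged too
generate byte found_cum = sum(regexm(codeks, "A") + regexm(codeks, "X")) >= 1

* Per-observation version: 1 if codeks contains "A" or "X", 0 otherwise
generate byte found = strpos(codeks, "A") > 0 | strpos(codeks, "X") > 0

* The same per-observation test written as one regular expression
generate byte found2 = regexm(codeks, "[AX]")

list
```

In this toy data the last observation, 11111, contains neither A nor X,
yet found_cum is 1 for it because the running sum is already positive;
found and found2 are both 0 there.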
Anders Alexandersson wrote:
Thaddee Badibanga wrote:
> I'd like to create an index from a
> variable which is a pseudo numeric or a string(numeric
> as well character). This index will allow me to
> eliminate some observations in the dataset. To give
> you an idea, the variable I termed codeks is as
> follows:
> codeks:101 102 01A 01X 0AX ...103 ... 111 112 ...11111
>
> I'd like to create an index that assigns 1 if codeks
> includes A or X or AX and 0 otherwise. I have done
> this in other programs. In one program for instance,
> this can be done as:
> found=indexc(codeks,"A","X")
>
> I will really appreciate your help. I have spent more
> than 3 hours without success.
I am not aware of a similar function in Stata. But the regexm() string
function combined with a Boolean expression should work. This FAQ
explains Boolean expressions in Stata:
http://www.stata.com/support/faqs/data/trueorfalse.html
For example, regexm(codeks, "A") would evaluate to 1 if codeks has the
string A, and to 0 otherwise.
I have not tried the following, but I think it will work as you
intended:
gen found = sum( regexm(codeks, "A") + regexm(codeks, "X") ) >= 1