Statalist



Re: st: Seeking help in stata


From   "Anders Alexandersson" <andersalex@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Seeking help in stata
Date   Fri, 14 Dec 2007 12:28:02 -0500

If "A" and "X" are not the only alphabetic
characters, then

. list if missing(real(codeks))

will also identify observations containing those other alphabetic
characters, which is not wanted.
Here is my final attempt at using my idea with regular expressions
(found1 is Nick's original solution; found2 is mine, I hope):


. input str3 codeks

        codeks
  1. 101
  2. 102
  3. 01A
  4. 01X
  5. 0AX
  6. EFG
  7. end

. gen byte found1 = (strpos(codeks, "A") > 0) | (strpos(codeks, "X") > 0)

. gen byte found2 = regexm(codeks, "A|X")

. list

     +--------------------------+
     | codeks   found1   found2 |
     |--------------------------|
  1. |    101        0        0 |
  2. |    102        0        0 |
  3. |    01A        1        1 |
  4. |    01X        1        1 |
  5. |    0AX        1        1 |
     |--------------------------|
  6. |    EFG        0        0 |
     +--------------------------+

. list if missing(real(codeks))

     +--------------------------+
     | codeks   found1   found2 |
     |--------------------------|
  3. |    01A        1        1 |
  4. |    01X        1        1 |
  5. |    0AX        1        1 |
  6. |    EFG        0        0 |
     +--------------------------+

. list if found2

     +--------------------------+
     | codeks   found1   found2 |
     |--------------------------|
  3. |    01A        1        1 |
  4. |    01X        1        1 |
  5. |    0AX        1        1 |
     +--------------------------+
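
For what it is worth, the same match can probably also be written with a
character class instead of the alternation operator. The following is only a
sketch under that assumption: the variable name found3 is made up for
illustration and was not part of the session above, and the data are simply
re-entered so the snippet runs on its own.

```stata
clear
input str3 codeks
"101"
"102"
"01A"
"01X"
"0AX"
"EFG"
end

* character class: 1 if codeks contains "A" or "X" anywhere, else 0
gen byte found3 = regexm(codeks, "[AX]")

* check that the class form agrees with the alternation form
assert found3 == regexm(codeks, "A|X")

list
```

If the -assert- passes silently, the two patterns flag the same
observations. The class form is a little shorter when more single letters
need to be added later.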

Anders

On Dec 14, 2007 12:05 PM, Anders Alexandersson <andersalex@gmail.com> wrote:
> I prefer Gabi's and Nick's solution(s) too.
>
> Of course, regarding my own ideas, it is egen's rowtotal() function,
> not egen's total() function or generate's sum() function, that creates
> regular sums but that idea is moot by now. And to make bad things
> worse, from my perspective, I need to read up on how to use the
> logical operator in regular expressions in Stata.
>
> Thanks,
> Anders
>
>
> On Dec 14, 2007 11:40 AM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> > Yes, good idea. In so far as "A" and "X" are the only alphabetic
> > characters, then
> >
> > . list if missing(real(codeks))
> >
> > will identify any observations with "A", "X", "AX" within -codeks-,
> > and no others.
> > Note that no intermediate variable is needed for that purpose.
> >
> > Gabi Huiber
> >
> > This solution is not general, in that it assumes that all the codeks
> > values that do not include A, X, or AX are numeric strings. If that is
> > the case, Badi could simply do this:
> >
> > gen x=real(codeks)
> >
> > and x will show a missing value everywhere codeks shows A, X, or AX.
> >
> > Then it won't be too hard to say
> >
> > gen codeks_ax_dummy=(x==.)
> > drop x
>