Statalist


Re: AW: st: -word()- with non space separator


From   Jeph Herrin <[email protected]>
To   [email protected]
Subject   Re: AW: st: -word()- with non space separator
Date   Wed, 23 Sep 2009 14:01:38 -0400

Thanks. But I noted in my original post that I saw a solution
using -split-, followed by running through the generated variables
- more or less along these lines. I was looking for something
more elegant. -regexm()- eventually yielded a simple solution.
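
The -regexm()- solution itself is not shown in this message. The following is only a sketch of what such a test might look like, not Jeph's actual code. It assumes the variable is called myvar, as in the original post, and that the codes run from 1 to 12; the example data are made up. The pattern accepts a code only when it is bounded on the left by the start of the string or a colon, and on the right by a colon or the end of the string.

```stata
* Hypothetical example data; myvar and the range 1/12 are assumptions
clear
input str20 myvar
"1:2:3:5:6:7:8:9"
"7:9"
"1:11:12"
"11:12"
end

* A code counts as present only as a whole colon-delimited item,
* so 1 does not match inside 11 or 12
forvalues i = 1/12 {
    generate byte myvar_`i' = regexm(myvar, "(^|:)`i'(:|$)")
}

list myvar myvar_1 myvar_11 myvar_12
```

With these data, myvar_1 is 1 for "1:11:12" but 0 for "11:12", which is the case that a plain -strpos()- search gets wrong.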

cheers,
Jeph

Martin Weiss wrote:
<>
I would have recommended
http://www.stata-journal.com/article.html?article=dm0039, until I noticed
that you are one of the authors...


*************

clear*
input str20 stringanswer
"1:2:3:5:6:7:8:9"
"1:2:3:6"
"1:2:3:4:5:7:8:9"
"1:2:3:5:7:9"
"1:2:3:5:7:8:9"
"2:3:4:6:9"
"1:2:3:5:6:7:8:9"
"1:2:7:8:9"
"7:9"
"1:11:12"
end

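* Split on ":" into comp1, comp2, ...; pieces fill from the left,
* so the K-th piece holds the K-th listed code, not code K
split stringanswer, generate(comp) parse(:)
destring comp*, replace

* Largest code that occurs anywhere in the data
egen rowmaxim = rowmax(comp*)
summarize rowmaxim, meanonly

* One 0/1 indicator per code: 1 if any of the pieces equals that code
forvalues i = 1/`r(max)' {
	egen byte my`i' = anymatch(comp*), values(`i')
}

drop comp* rowmaxim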
*************



HTH
Martin


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On behalf of Jeph Herrin
Sent: Wednesday, 23 September 2009 17:11
To: [email protected]
Subject: Re: st: -word()- with non space separator

Thanks. As I note in the paragraph after my data snippet,
-strpos()- works as long as there are <=9 values, but doesn't
work once the codes have multiple digits: strpos("11:12","1") = 1,
even though "1" is not really in the list.
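
One way around this, shown here only as a sketch, is to pad both the string and the search term with the separator so that -strpos()- can match only whole items. The variable name var1 follows Eric's example; the has_ prefix, the range 1/12, and the example data are assumptions for illustration.

```stata
* Hypothetical example data; var1 follows Eric's naming, has_ is made up
clear
input str20 var1
"1:2:3:6"
"11:12"
"1:11:12"
end

* Plain substring search: finds a 1 at position 1 of 11:12
display strpos("11:12", "1")

* Padded search: the padded string has no stand-alone 1, so this gives 0
display strpos(":" + "11:12" + ":", ":1:")

* Same idea applied to every code from 1 to 12
forvalues i = 1/12 {
    generate byte has_`i' = strpos(":" + var1 + ":", ":`i':") > 0
}

list var1 has_1 has_11 has_12
```

Because every item in the padded string has a colon on both sides, the first and last items need no special handling.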

cheers,
J

Eric A. Booth wrote:
I would use -strpos()-.

******
clear
input str20 var1
"1:2:3:5:6:7:8:9"
"1:2:3:6"
"1:2:3:4:5:7:8:9"
"1:2:3:5:7:9"
"1:2:3:5:7:8:9"
"2:3:4:6:9"
"1:2:3:5:6:7:8:9"
"1:2:7:8:9"
"7:9"
end
forval n = 1/9 {
    * 1 if the digit appears anywhere in var1, else 0
    * (plain substring match, so only safe for single-digit codes)
    gen byte myvar_`n' = strpos(var1, "`n'") > 0
}
li var1 myvar_*

******

Best,

Eric

__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
[email protected]
Office: +979.845.6754

On Sep 23, 2009, at 9:29 AM, Jeph Herrin wrote:

I have a dataset in which many variables are in
the most useless format imaginable. If a question
has multiple checkboxes as possible answers, the
response is stored as a string, with a number indicating
each box checked and these numbers separated by colons.
Thus:

               myvar
     1:2:3:5:6:7:8:9
             1:2:3:6
     1:2:3:4:5:7:8:9
         1:2:3:5:7:9
       1:2:3:5:7:8:9
           2:3:4:6:9
     1:2:3:5:6:7:8:9
           1:2:7:8:9
                 7:9

This variable takes 9 values, so I want to split it into 9
different indicator variables, myvar_1-myvar_9, each
indicating whether that number was selected. -split-
does not work on its own, because the number of values
differs from string to string. That is, it produces a myvar_1
that equals "7" for the last obs.
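
The positional behaviour of -split- can be seen in a small sketch. The two rows are taken from the data above; the stub name piece is an assumption for illustration.

```stata
* Two rows from the data above; the stub name piece is made up
clear
input str20 myvar
"1:2:3:6"
"7:9"
end

* Pieces are assigned left to right, regardless of their value
split myvar, parse(:) generate(piece)
list
```

In the second row piece1 holds "7" and piece2 holds "9", while piece3 and piece4 are empty, so the position of a piece says nothing about which box was checked.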

So I am looking for a way to check whether a given string
contains a given integer, which would allow me to

  forv i=1/9 {
    gen byte myvar_`i'= [`i' is in myvar list]
  }

As long as there are just 9 values, I can use -strpos()-
to check for the presence of the digit, but some of my variables
run into the tens and twenties, in which case, e.g., searching for "1"
returns true even if there is only "11".

The only solutions I see are to first -split- and
then check all the new variables, or to run through a series of
checks such as (matches "1:" but not ":1").  I don't like
either: is there a direct way to check whether a given integer
is in the list?

I think there may be a regex solution, but my Perl programming
days are so far behind me that I've not been able to come up
with one.

thanks,
Jeph



*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

