RE: st: -word()- with non space separator

Wed, 23 Sep 2009 18:43:05 +0100

Not knowing the highest value in advance would bite equally hard with
the method in your previous post, which works from 1 upwards to a
specified maximum, so that objection seems unconvincing to me.

Nick
n.j.cox@durham.ac.uk

Jeph Herrin

Thanks. I also thought of something like this, but didn't
want to pursue it, if that makes sense. For one thing, I
have literally thousands of variables and don't know ahead
of time what the highest number I need is.

As for the structure, it may not be the worst, but it is
surely not the best.

Nick Cox wrote:
> Another way to do it:
>
> clonevar work = myvar
>
> qui forval i = 29(-1)1 {
>     gen myvar_`i' = strpos(work, "`i'") > 0
>     replace work = subinstr(work, "`i'", "", .)
> }
>
> Here 29 is in general whatever highest number you need.
>
> In words, in addition to the -strpos()- logic,
>
> 1. Work on a copy, because we're going to change it.
>
> 2. Work downwards, from high values down to 1.
>
> 3. Once you've checked for a longer string, zap it so that it doesn't
> later confuse the search for shorter strings.
>
> Incidentally, don't knock the format (or structure). When Uli Kohler and
> I wrote up the tricks we knew for multiple responses (in this sense), it
> was pretty clear to us that all such formats or structures have some big
> advantages and disadvantages. Our efforts are accessible at
>
> FAQ     . . . . . . . . . . . . .  Dealing with multiple responses
>         . . . . . . . . . . . . . . . .  N. J. Cox and U. Kohler
>         4/05    How do I deal with multiple responses?
>                 http://www.stata.com/support/faqs/data/multresp.html
>
> SJ-3-1  pr0008  Speaking Stata: On structure & shape: the case of mult. resp.
>         . . . . . . . . . . . . . . . . .  N. J. Cox & U. Kohler
>         Q1/03   SJ 3(1):81--99                         (no commands)
>         discussion of data manipulations for multiple response data
>
> Nick
> n.j.cox@durham.ac.uk
>
> Jeph Herrin
>
> Solved - this does it:
>
> forv i=1/9 {
>     gen byte myvar_`i'= regexm(myvar,"^`i':|:`i':|:`i'$")
> }
>
>
> Jeph Herrin wrote:
>
>> I have a dataset in which many variables are in
>> the most useless format imaginable. If a question
>> has multiple checkboxes as possible answers, the
>> response is stored as a string, with a number indicating
>> each box checked and these numbers separated by colons.
>> Thus:
>>
>>    myvar
>>    1:2:3:5:6:7:8:9
>>    1:2:3:6
>>    1:2:3:4:5:7:8:9
>>    1:2:3:5:7:9
>>    1:2:3:5:7:8:9
>>    2:3:4:6:9
>>    1:2:3:5:6:7:8:9
>>    1:2:7:8:9
>>    7:9
>>
>> This variable takes 9 values, so I want to split into 9
>> different indicator variables, myvar_1-myvar_9, each
>> indicating whether that number was selected. -split()-
>> does not work, because of the differing number of values
>> per string. That is, it produces myvar_1 which equals "7"
>> for the last obs.
>>
>> So I am looking for a way to check whether a given string
>> contains a given integer, which would allow me to
>>
>> forv i=1/9 {
>>     gen byte myvar_`i'= [`i' is in myvar list]
>> }
>>
>> As long as there are just 9 values, I can use -strpos()-
>> to check for the presence of the digit, but some of my variables
>> run into tens and twenties, in which case eg searching for "1"
>> returns true even if there is only "11".
>>
>> The only solutions I see are to first -split()- and
>> then check all the new indicators, or run through a series of
>> checks such as (matches "1:" but not ":1"). I don't like
>> either: Is there a direct way to check to see if a given integer
>> is in the list?
>>
>> I think there may be a regex solution, but my Perl programming
>> days are so far behind me that I've not been able to come up
>> with one.
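
A further possibility, offered only as a sketch and not something
proposed in the thread: pad each string with a colon at both ends, so
that every code is delimited on both sides, and then search for ":i:"
with -strpos()-. That needs no regular expression and no particular
loop order. It should also catch an observation holding a single code
such as "7", which the three-branch pattern quoted above would not
match, because each branch expects at least one colon. The largest code
can be read off the data with -split- and -egen, rowmax()-, which speaks
to not knowing it ahead of time. The toy observations "2:11:21" and "11"
and the names part, maxcode and top are made up for illustration.

```stata
* Toy data in the colon-separated format of the original post.
* The last two observations are invented to exercise two-digit codes.
clear
input str20 myvar
"1:2:3:5:6:7:8:9"
"1:2:3:6"
"7:9"
"2:11:21"
"11"
end

* Find the largest code present, since it may not be known in advance.
* part1, part2, ... and maxcode are temporary helpers (hypothetical names).
split myvar, parse(":") generate(part) destring
egen maxcode = rowmax(part*)
summarize maxcode, meanonly
local top = r(max)
drop part* maxcode

* Pad both ends with ":" so that every code sits between two colons,
* then look for ":i:". Searching for ":1:" cannot match inside ":11:".
forvalues i = 1/`top' {
    generate byte myvar_`i' = strpos(":" + myvar + ":", ":`i':") > 0
}

list myvar myvar_1 myvar_11 myvar_21
```

In the last toy observation ("11"), myvar_11 is 1 and myvar_1 is 0.
Codes that never occur between 1 and the maximum still get an all-zero
indicator, which can be dropped afterwards if unwanted.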
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

