Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Sergiy Radyakin <serjradyakin@gmail.com>

To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>

Subject: Re: st: RE: Test position of a whole word within a macro

Date: Tue, 3 Dec 2013 19:57:29 -0500

I might be wrong, but I guess the original poster means something like token("var12 var1 var2 var3", "var2") ==> 3 (a pseudo-function), which returns the word number of "var2" in the list given by the first argument.

This is similar to what Jack Newsham was just asking in a neighboring thread (too new to be in the Statalist archive yet, so it cannot be quoted). But we knew that in his case all US states were abbreviated to 2 letters, plus a space separator, so in the code I posted I was dividing by 3.

In your case it seems you are asking a more generic question than you need: you are asking about arbitrary words, but your words are variable names, and this is important. Variable names are at most 32 characters long. Hence you can just construct your original list in 32+1 format: "var12........spaces to fill 32.... var1........spaces to fill 32. etc", then use the same approach, but divide by 33 rather than by 3. Obviously constructing the (32+1)-formatted list is easy to automate, so you don't have to count spaces in the editor. String functions in Stata 13 should work with really big strings, so this approach should now work very well.

In general, the idea in both cases is to delegate the search to the strpos() function, which saves you a loop and the need to compare things. The cost is that you must be able to recover the intended content from the position (the result of strpos()), and the match must be unique. In Jack's case all states were unique and of the same length; in yours you should decorate the search list and the target with spaces to match whole words only.

IMHO Stata is missing the very helpful table() function present e.g. in GPSS (yes, here I really mean GPSS, not SPSS), which allowed one to build indirect references, and even to interpolate (as far as I remember). It would be trivial to implement, but that is not possible, since Stata does not allow defining user functions (to be used in Stata-language expressions). A valid substitute is usually a -recode- statement or a series of replaces, but it almost always results in multiple statements where one should be enough.

Best, Sergiy Radyakin

On Tue, Dec 3, 2013 at 5:53 PM, Sarah Edgington <sedging@ucla.edu> wrote:
> Brent,
> Forgive me if I am misunderstanding what you are trying to do, but it looks
> like from your initial example that you are trying to count the number of
> words in a string.
> If that is in fact what you're trying to do, you might try the function
> wordcount. So you'd have something like:
> local test=wordcount("var12 var3 var1")
>
> -Sarah
>
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Brent McSharry
> (ADHB)
> Sent: Tuesday, December 03, 2013 2:36 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: Test position of a whole word within a macro
>
> Dear statalisters
>
> I need to determine the position (in words, not characters) of a word within
> a macro. For example:
>
> local test = *wordposition*("var12 var3 var1")
>
> local test should have a value of 3.
>
> It will be used within a loop which will be called hundreds of times, so it
> must be performant, and for this reason I will only use a foreach loop if
> this is the only way.
>
> The programming (i.e. not Stata) solution would be to create a 'dictionary'/
> hash table of the string values.
>
> The pseudo-code for what I am trying to do is:
> . matrix `outmat' = J(`indepcount', 1,0)
> . matrix rownames `outmat' = `indepvars'
> . forvalues i=1(1)`iterations' {
> .   //reassign division of development and testing data sets
> .   //build model & exclude unwanted variables
> .   local included:colfullnames(e(b))
> .   //remove _cons from `included'
> .   foreach v in `included' {
> .     //if vectors could be referred to by string, one would use
> .     // `outmat'[`v'] = `outmat'[`v'] + 1
> .     //I don't think the above is possible in Stata, so instead:
> .     local i = //which word
> .     `outmat'[`i'] = `outmat'[`i'] + 1
> .   }
> . }
>
> Does anyone have any ideas on a performant way to test for an exact word
> match? Thank you
>
> Brent McSharry
> Intensivist
> Starship Hospital Auckland

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/
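The whole-word matching idea from the reply above (decorate the list and the target with spaces, then let strpos() do the search) can be sketched roughly as follows. This is not the fixed-width code from the reply: as an assumption for illustration, it swaps the division by 33 for wordcount() applied to the text that precedes the match, so no padding is needed. The macro names `vars`, `target`, `p`, `pos` and `check` are hypothetical. The last lines compare the result with the macro list function `posof` (see -help macrolists-), which is not mentioned in the thread and is used here only as a cross-check.

```stata
* Hypothetical list and target, taken from the pseudo-function example
local vars   "var12 var1 var2 var3"
local target "var2"

* Decorate both with spaces so that "var1" cannot match inside "var12"
local p = strpos(" `vars' ", " `target' ")

* Words before the match, plus one, give the word number; 0 means not found
local pos = cond(`p' > 0, wordcount(substr(" `vars' ", 1, `p')) + 1, 0)
display `pos'    // 3

* Cross-check against the built-in macro list function (an addition, not from the thread)
local check : list posof "`target'" in vars
assert `pos' == `check'
```

Because the string functions are evaluated in an expression, very long lists may hit string-length limits in releases before Stata 13, which is the caveat the reply alludes to.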

**References**:

- **st: Test position of a whole word within a macro** *From:* "Brent McSharry (ADHB)" <BrentM@adhb.govt.nz>
- **st: RE: Test position of a whole word within a macro** *From:* "Sarah Edgington" <sedging@ucla.edu>
