Statalist The Stata Listserver



st: Behaviour of -tokenize- shouldn't it drop the parsing character?


From   "David Elliott" <[email protected]>
To   [email protected]
Subject   st: Behaviour of -tokenize- shouldn't it drop the parsing character?
Date   Wed, 4 Oct 2006 18:44:56 -0300

I want to tokenize groups of numbers separated by the "|" character,
e.g. 1 2 3 | 4 5 6 | 7 8 | 9, so that each group ends up in its own
positional macro: _1 = 1 2 3, _2 = 4 5 6, ...  However, I have found that
-tokenize- does not behave as I expected.

To better illustrate, under Stata 9.2/SE I did the following experiment:

. local test 1 2 3
. tokenize `"`test'"'
. mac dir

_3:             3
_2:             2
_1:             1
_test:          1 2 3

. local test 1 | 2 | 3
. tokenize `"`test'"' , parse("|")
. mac dir

_5:             3
_4:             |
_3:             2
_2:             |
_1:             1
_test:          1 | 2 | 3



*Should* -tokenize- be returning the parsing character as positional
macros in addition to the parsed text?  This appears to be the
behaviour regardless of the parsing character chosen, although the
default parsing on " " does not do this.  It is not impossible to work
around (one could use a while "``i''" != "" { loop and increment `i' by
2), but it is contrary to how I would expect the command to work.
Doubtless the Stata mavens have some good reason for it working in
this manner, but I fail to see how having a parsing character in the
positional macros is useful.  The -help tokenize- file is silent on this
matter.  Is there something I am not grasping here, or is there a
better way of doing this?
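
For concreteness, the workaround I have in mind could be sketched roughly as
follows. This is only a sketch: the macro names group1, group2, ... and the
counter n are my own illustrative choices, and it tests each token against
"|" rather than stepping by 2, so it does not rely on the delimiter always
landing in the even-numbered positions.

```stata
* Sketch only: collect the "|"-separated groups into local macros
* group1, group2, ... (hypothetical names), skipping delimiter tokens.
local test 1 2 3 | 4 5 6 | 7 8 | 9
tokenize `"`test'"', parse("|")

local n 0
while `"`1'"' != "" {
    * keep the token only if it is not the parsing character itself
    if `"`1'"' != "|" {
        local ++n
        local group`n' `1'
    }
    * move `2' into `1', `3' into `2', and so on
    macro shift
}

* list what was collected
forvalues j = 1/`n' {
    display "group`j' = `group`j''"
}
```

With the test string above, I would expect this to leave four groups in
group1 through group4, but it still feels like more machinery than the task
should need.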

Many thanks.

--
David Elliott


