
st: Behaviour of -tokenize- shouldn't it drop the parsing character?

From	"David Elliott" <[email protected]>
To	[email protected]
Subject	st: Behaviour of -tokenize- shouldn't it drop the parsing character?
Date	Wed, 4 Oct 2006 18:44:56 -0300

I want to tokenize groups of numbers separated by the "|" character,
e.g. 1 2 3 | 4 5 6 | 7 8 | 9, so that each group ends up in its own
positional macro (_1 = 1 2 3, _2 = 4 5 6, ...).  However, I have found
that tokenize does not behave as I expected.

To better illustrate, under Stata 9.2/SE I did the following experiment:

. local test 1 2 3
. tokenize `"`test'"'
. mac dir

_3:             3
_2:             2
_1:             1
_test:          1 2 3

. local test 1 | 2 | 3
. tokenize `"`test'"' , parse("|")
. mac dir

_5:             3
_4:             |
_3:             2
_2:             |
_1:             1
_test:          1 | 2 | 3



*Should* tokenize return the parsing character as positional macros
in addition to the parsed text?  This appears to be the behaviour
regardless of the parsing character chosen, although the default
parsing on " " does not do this.  It is not impossible to work around
(one could loop with while "``i''" != "" { and increment `i' by 2),
but it is contrary to how I would expect the command to work.
Doubtless the Stata mavens have some good reason for it working in
this manner, but I fail to see how having a parsing character in the
positional macros is useful.  The -help tokenize- is silent on this
matter.  Is there something I am not grasping here, or is there a
better way of doing this?
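
A minimal sketch of the increment-by-2 workaround, assuming every
group is non-empty so that the groups land in the odd-numbered
positional macros and "|" in the even-numbered ones.  The counters i
and g and the macros group1, group2, ... are hypothetical names chosen
only for illustration:

```stata
* Sketch only: copy the odd-numbered tokens into group1, group2, ...
local test 1 2 3 | 4 5 6 | 7 8 | 9
tokenize `"`test'"', parse("|")

local i = 1
local g = 0
while `"``i''"' != "" {
    local g = `g' + 1
    * the current positional macro holds one group of numbers
    local group`g' ``i''
    * step over the separator stored in the next position
    local i = `i' + 2
}
```

If two separators can be adjacent (an empty group), the fixed step of
2 would fall out of alignment, so this is only a starting point.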

Many thanks.

--
David Elliott

Follow-Ups:
- st: RE: Behaviour of -tokenize- shouldn't it drop the parsing character?
  - From: "Scott Merryman" <[email protected]>
