st: Problem parsing strings that contain "$CHAR"

From	<[email protected]>
To	<[email protected]>
Subject	st: Problem parsing strings that contain "$CHAR"
Date	Tue, 20 Apr 2010 15:01:51 -0400

Dear Stata users,

I have to parse the lines of SAS syntax files that contain 1) the position of a variable in a raw data file, 2) the name of the variable, and 3) the data type of the variable. Examples of such lines are:

@04075    Variable_B    $5.
@04080    Variable_C    $CHAR6.

Both "$#" and "$CHAR#" mean that the variable is a string variable of # characters. I have no problem parsing the first line above, but I am looking for advice on how to handle lines that contain a sequence like "$CHAR6". Stata appears to evaluate such a sequence before the string is parsed, and the sequence evaluates to "", as shown below:

. loc line @04080    Variable_C    $CHAR6.
. di "`line'"
@04080    Variable_C    .
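
A quick check that seems consistent with this, on the assumption that Stata reads $CHAR6 as a reference to a global macro named CHAR6 (the value "XYZ" is made up purely for illustration):

. global CHAR6 "XYZ"
. loc line @04080    Variable_C    $CHAR6.
. di "`line'"
. macro drop CHAR6

If that assumption is right, the display should show XYZ in place of $CHAR6 while the global is defined, and nothing in its place once the global has been dropped.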

Functions like regexm() and subinstr() and commands like tokenize also see "." instead of "$CHAR6.", making it impossible (for me at least) to retrieve all the relevant information in these lines.

Any help appreciated,

Benoit-Paul Hebert
Recherche en politiques / Policy Research
RHDCC/HRSDC
