st: re: getting the quotes right


From   Finne Håkon <[email protected]>
To   [email protected]
Subject   st: re: getting the quotes right
Date   Fri, 28 Jun 2002 20:00:40 +0200

Thanks to Zhiqiang Wang, Bill Gould, Thomas Steichen, Nick Cox, and Nick
Winter for contributing to getting the quotes right. (Quite a roster, that!)
The advice did not converge but my problem is solved, so here is an attempt
at a synthesis. I made some surprising discoveries in the process.

My original faulty construct (I have added a variable in the -list- command
for improved troubleshooting):

* Håkon
local buicks (make=="Buick Century" | make=="Buick Electra" | make=="Buick Opel")
local datsuns (make=="Datsun 200" | make=="Datsun 210")
local vws (make=="VW Dasher" | make=="VW Rabbit" | make=="VW Scirocco")
local brands `buicks' `datsuns' `vws'
foreach brand of local brands {
	list make price if `brand'
	}

I failed to tell you but the error message given was

too few quotes
r(132);

Zhiqiang suggested using compound double quotes in each of lines 1-3,
e.g. for line 2:

local datsuns (make==`"Datsun 200"' | make==`"Datsun 210"')

etc. It turns out that this is not necessary in any of the solutions, because
these are the innermost level of nested double quotes, but they do no harm.
I continue without them for legibility.

Zhiqiang then crucially suggested the use of nested single quotes for macro
substitution in line 6. Nick W did the same. They both also changed the
logic of my code, so I cannot use their code directly. Zhiqiang's suggestion
translates as the following amendments for my code (lines 4 and 6):

local brands buicks datsuns vws

	list make price if ``brand''

Note that the single quotes are gone from line 4 and that they are nested in
line 6. This means that the macro named
	brands
in line 4 gets the contents
	buicks datsuns vws
instead of the contents
	(make=="Buick Century" | make=="Buick Electra" | make=="Buick Opel")
	(make=="Datsun 200" | make=="Datsun 210")
	(make=="VW Dasher" | make=="VW Rabbit" | make=="VW Scirocco")
so that in line 6, the -list- command first meets the construct
	... if ``brand''
which is substituted from the inside. First, Stata looks for the macro named
	brand
which for the nth loop of -foreach- is the nth element of the macro named
	brands
The first, second and third elements are the strings
	buicks
	datsuns
	vws
Assuming we are in the second loop, the construct
	... if ``brand''
is now transformed to
	... if `datsuns'
which easily transforms again through macro substitution to
	... if (make=="Datsun 200" | make=="Datsun 210")
so the code below works without error:

* Zhiqiang-NickW
local buicks (make=="Buick Century" | make=="Buick Electra" | make=="Buick Opel")
local datsuns (make=="Datsun 200" | make=="Datsun 210")
local vws (make=="VW Dasher" | make=="VW Rabbit" | make=="VW Scirocco")
local brands buicks datsuns vws
foreach brand of local brands {
	list make price if ``brand''
	}

and the output is as hoped for:
     make                   price
  4. Buick Century          4,816
  5. Buick Electra          7,827
  7. Buick Opel             4,453

     make                   price
 56. Datsun 200             6,229
 57. Datsun 210             4,589

     make                   price
 70. VW Dasher              7,140
 72. VW Rabbit              4,697
 73. VW Scirocco            6,850
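
The two-step substitution described above can be watched directly. What
follows is only a small sketch, not part of anyone's posted solution. It
assumes the auto data are in memory (loaded here with -sysuse auto-, which is
my assumption about how you get at them) and it sets the macro named brand by
hand, instead of letting -foreach- do it:

```stata
* Sketch only: assumes the auto data that ship with Stata
sysuse auto, clear

local datsuns (make=="Datsun 200" | make=="Datsun 210")
* Set by hand here; inside the loop -foreach- does this for us
local brand datsuns

* One level of substitution: shows the name datsuns
display `"`brand'"'

* Two levels: shows the contents of the macro with that name
display `"``brand''"'

* After substitution these two commands are the same
list make price if ``brand''
list make price if (make=="Datsun 200" | make=="Datsun 210")
```

The compound double quotes around the -display- arguments are there only
because the contents being shown contain double quotes themselves.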


Bill then very logically explained the use of compound double quotes. As he
said, "because -foreach- parses on spaces and binds on quotes", his concrete
solution boils down to an amended line 4:

local brands `"`buicks'"' `"`datsuns'"' `"`vws'"'

which amounts to using compound double quotes for quoting the three macros
as a way of later signalling to -foreach- that the macro named brands has
three distinct string elements, each bound by quotes and separated by
spaces. I found it all very logical and I tested it but got the error
message from line 6

too few quotes
r(132);

which was all too familiar.

Tom also tested Bill's variant but with some modifications. First, he added
compound double quotes around each of the complete strings in lines 1-3.
For example, line 2:

local datsuns `"(make=="Datsun 200" | make=="Datsun 210")"'

This is intended to set the three strings from lines 1-3 apart from each
other when they are concatenated, but I tested it and it makes no difference
to what follows. Tom also used the following variation on line 5:

foreach brand of local `"`brands'"' {

and got the error message from line 5

local macro name "(make=="Buick Century" | make=="Buick Electra" | make=="Buick Opel") too long
r(198);

Nick C helped identify this error correctly as relating to the length of the
_name_ of a macro: it stems from letting -foreach- operate on a macro named
`brands' or `"`brands'"' (both of which expand to the string in the error
message) instead of on a macro named brands. But at least this
error message contains some useful information. What it tells us is that
-foreach- has found the first element of the string
	`"`buicks'"' `"`datsuns'"' `"`vws'"'
to be
	"(make=="Buick Century" | make=="Buick Electra" | make=="Buick Opel")
so we can conclude that -foreach- has parsed roughly as we intended it: the
first element ends where it should but it contains a regular double quote at
the beginning that shouldn't be there. Where does it come from? I couldn't
figure that one out; I don't understand the rules of inside-outside /
left-right for the parser well enough. It might be this extraneous " that
also invokes the error message "too few quotes" when running my test of
Bill's code. But Bill's logic seems impeccable so there is probably
something in the parsing rules that does this. In particular, look at what
happens to Bill's secondary suggestion, which was to get rid of the macro
named brands (drop line 4) and go straight to another syntax of -foreach-
for line 5:

foreach brand in `"`buicks'"' `"`datsuns'"' `"`vws'"' {

With this amendment, which requires simpler macro substitutions, Bill's code
works! Here it is:

* Bill
local buicks (make=="Buick Century" | make=="Buick Electra" | make=="Buick Opel")
local datsuns (make=="Datsun 200" | make=="Datsun 210")
local vws (make=="VW Dasher" | make=="VW Rabbit" | make=="VW Scirocco")
foreach brand in `"`buicks'"' `"`datsuns'"' `"`vws'"' {
	list make price if `brand'
	}
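
To see what "parses on spaces and binds on quotes" means in practice, here
is a small sketch with made-up elements (a b and c d are placeholders of
mine, not anything from the thread). The counters n and m are there only so
the number of passes through each loop can be checked afterwards:

```stata
* Sketch only: each compound-quoted string counts as ONE element
local n 0
foreach x in `"a b"' `"c d"' {
	local ++n
	display `"element `n': `x'"'
}

* Without the quotes the same text is split at every space
local m 0
foreach x in a b c d {
	local ++m
	display `"element `m': `x'"'
}

* Number of passes through each loop
display "quoted: `n' passes, unquoted: `m' passes"
```

The first loop should run twice and the second four times, which is the
same mechanism that lets each whole condition travel as one element in the
* Bill code.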


Nick W suggested using -inlist- rather than the lengthy strings. With
minimum changes from my now working * Zhiqiang-NickW code it looks like this
(and works correctly):

* NickW
local buicks `""Buick Century","Buick Electra","Buick Opel""'
local datsuns `""Datsun 200","Datsun 210""'
local vws `""VW Dasher","VW Rabbit","VW Scirocco""'
local brands buicks datsuns vws
foreach brand of local brands {
	list make price if inlist(make,``brand'')
	}

I had to bind (as Nick W also suggested) each of the full length strings in
lines 1, 2 and 3 with external compound quotes in order to avoid the error
message:

invalid syntax
r(198);

I did not, however, have to follow his suggestion

local brands "buicks datsuns vws"

even though that, too, worked: -local- read the 18 characters (including
spaces) as a string anyway.

Incidentally, the * NickW code works with

foreach brand in buicks datsuns vws {

and the * Bill code works with -inlist- (provided the three first macros are
defined as in the * NickW case).

So it appears that a string representing a comma-separated list in -inlist-
has to be better bounded than a string representing the logical condition
for -if- .

In general, the rules of quotes are clear:
`...' delimit local macros
``...'' delimit local macros whose names are local macros themselves
"..." delimit strings
`"..."' is a safer way of delimiting strings because these delimiters are
nestable.
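
As an illustration of the four rules above, here is a minimal sketch. The
macro names greeting and which are invented for the example and have
nothing to do with the auto data:

```stata
* Sketch only: one line per kind of quote
local greeting hello
local which greeting

* Single quotes: contents of the macro named greeting
display "`greeting'"

* Nested single quotes: the macro whose name is stored in which
display "``which''"

* Plain double quotes delimit a string
display "a plain string"

* Compound double quotes can hold double quotes inside
display `"a string with "inner" double quotes"'
```

The first two -display- lines should both show hello, the second one
getting there by way of the name stored in the macro named which.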

Despite this clarity of rules, the applications are often difficult because
the way quotes are treated by Stata's parser depends on what kind of syntax
element it expects to find where the quotes are placed. It also adds to the
complexity that the quotes are not necessarily directly visible in the line
of code where the parser meets them because of substitution. It is, for
example, very difficult to use -display- as a debugging tool for the
placement of single, double, and compound quotes and the evaluation of
macros because -display- generally looks for variables to be evaluated or
strings whereas the thing you want to display for debugging may be the names
of variables after macro substitution, etc. Look also at the extra "
produced by the logically derived * Bill code: I am sure its source can be
discovered, but it is not easy (or else Bill's code wouldn't have been
logical). Put another way: when the parser meets a string of characters, it
sometimes sees a string value and sometimes a variable name (there are other
possibilities as well).
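
One thing that may help with this kind of debugging, offered as a sketch
rather than a cure, is to look at what a local macro contains without
asking -display- to evaluate anything. -macro list- shows macro contents,
and local macros are referred to there with a leading underscore. Wrapping
the macro in compound double quotes for -display- is the other option:

```stata
* Sketch only: inspect macro contents without evaluating them
local datsuns (make=="Datsun 200" | make=="Datsun 210")
local brands buicks datsuns vws

* Local macros are listed under their names with a leading underscore
macro list _datsuns _brands

* Or show the contents as a string, protected by compound double quotes
display `"`brands'"'
display `"`datsuns'"'
```

Neither approach explains the parser's rules, but both show what the parser
is actually handed after substitution.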

And even if this hypothesis of mine is wrong, I think that the various
contributions to this thread indicate that at least sometimes, it is not
only a beginner's problem getting the quotes right.

-- Håkon
[email protected]
(hoping I got my quotes right now)