RE: st: Beware of Stata's Syntax
"Nick Cox" <firstname.lastname@example.org>
Fri, 23 May 2008 15:24:41 +0100
Sergiy has put his finger on a quirky consequence of -syntax-, but I
don't agree with his conclusion.
With -syntax-, being able to say -varname- when you mean a -varlist- of
just one variable is a convenience that many programmers are happy to use.
Being able to say -[varlist]- when a varlist is to be filled in if not
given is key when that is what you want.
Putting the two together is what has the quirky consequence, as
-syntax [varname]-
is going to be an appropriate syntax if, and only if, there is just one
variable in the dataset and you want specification of that variable to
be optional by the user. That is indeed an unlikely situation, but there
is nothing either illogical or impossible about it. It's easy to imagine
situations in which you could have just one variable, say a list of
addresses or a list of replicate measurements, and the syntax is then
appropriate.
As Sergiy points out, if you have two or more variables, and don't specify
any, then you get bitten. The syntax is then contradictory, as the [ ]
oblige Stata to fill in the varlist with your existing variables, but
the -varname- obliges Stata to complain that more than one variable has
been specified.
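A minimal sketch of the two situations, in case it helps. The program name -showvar- and the variable names are made up for illustration; the behaviour is the one Sergiy reported.

```stata
clear all                       // drops data and any programs in memory
program define showvar          // hypothetical program name
    syntax [varname]
    display "`varlist'"
end

set obs 1
generate address = "10 Main St"
showvar                         // one variable in memory: varlist is filled in with address

generate reading = .
capture noisily showvar         // two variables in memory, none specified
local rc = _rc                  // nonzero: the filled-in varlist breaks the limit of one
display `rc'
```

The first call works because the default varlist happens to contain exactly one name. The second fails without the user having typed any variable name at all.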
There are various possible reactions to this. One is "Don't do that
then!", i.e. that the programmer should not use a -syntax- that is not
appropriate for the problem, or that the user is misusing or
misunderstanding the purpose of the program. (Depending on
circumstances, that could be the fault of the user, or of the programmer
in not writing clear enough documentation.)
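For a programmer who wants the variable to be optional without the fill-in, a sketch along the lines of the default=none specifier in the help that Austin quotes might look like this. The program name -showvar2- and the message text are my own assumptions, not anything official.

```stata
clear all
program define showvar2         // hypothetical program name
    // default=none: an omitted varlist stays empty instead of being filled in
    syntax [varname(default=none)]
    if "`varlist'" == "" {
        display "no variable specified"
    }
    else {
        display "`varlist'"
    }
end

set obs 1
generate a = .
generate b = .
showvar2                        // runs; varlist is left empty
showvar2 b                      // runs; varlist is b
capture noisily showvar2 a b    // refused: more than one variable typed
local rc = _rc
```

Here the program complains only when the user actually types too many variables, which is probably what most authors of such a program intend.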
Another is Sergiy's reaction, which is that -syntax- should not allow a
syntax like this, as it will usually bite you. I don't follow this at
all. -syntax- has one and only one job, to act as a go-between between
the programmer's program and the user's inputs. Rewriting it so that it
also starts filtering things that do not seem sensible or a good idea,
or that won't work in some circumstances, is undesirable in principle
and quite unworkable in practice. There must be a pretty large fraction
of legal syntaxes that won't work in practice in some circumstances, but
it isn't -syntax-'s job to reject those syntaxes on those grounds.
Changing "some circumstances" to "most" or even "almost all" does not
affect the main problem and would create other difficulties, as
StataCorp's programmers would then be obliged to make innumerable
judgment calls on marginal cases. Besides, any language like Stata lends
itself continually to serendipitous discoveries or inventions of novel
problems to solve or ways of doing things that were not foreseen by the
developers. It's programmers like Sergiy who do that!
In this particular case, suppose you want precisely this
syntax of [varname]. Then to be told, in effect, that your syntax is not
a good idea would be very frustrating (although there would always be
other ways of implementing it with lower level commands).
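As a rough sketch of what one such lower level route could look like, here is a hand-rolled stand-in built on -unab-. The program name -onevar- is hypothetical, and the sketch ignores options, -if- and -in-, so it is an illustration and not a replacement for -syntax-.

```stata
clear all
program define onevar           // hypothetical program name
    // no argument means "all variables"; -unab- then enforces at most one
    if `"`0'"' == "" local 0 _all
    unab varlist : `0', max(1)
    display "`varlist'"
end

set obs 1
generate address = "10 Main St"
onevar                          // fills in address, as -syntax [varname]- would
onevar address                  // same result when the variable is named
```

With a second variable in memory and nothing specified, -unab- should object in much the same way as -syntax- does, so the quirk is a property of the request and not of the tool.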
Ferreting out logical inconsistencies in Stata, when they exist, is a
public service, but I can't see this as a logical error or even a
substantial practical problem.
[mailto:email@example.com] On Behalf Of Sergiy
Sent: 22 May 2008 23:41
Subject: Re: st: Beware of Stata's Syntax
Thank you, Austin. It follows that by default -syntax- will add an
additional option (varlist=* max=1) which renders -syntax- inoperable.
This makes it a poor choice for defaults, since most users probably
work with datasets with more than one variable. And it makes even less
sense when "varname" is in [ ].
As for the documentation, I thought when I type
StataBug x y z, option1(value1)
this would be exactly the line that is going to the parser, and after
that x y and z are put to the -varlist-.
So that if I type
StataBug, option1(value1)
and StataBug contains
syntax [varname], option1(string)
Stata converts the string to be parsed into:
"[varlist min=0 max=1], option1(string)"
substitutes from what I have typed:
"[min=0 max=1], option1("value1")"
evaluates, must result in OK, because 0 is between 0 and 1
then fills in varlist with the variable names (if specified) or all
variable names (if not specified) and lets the program go.
Adding * to varlist and adding (max=1) should be mutually exclusive,
since together they will fail most of the time.
On 5/22/08, Austin Nichols <firstname.lastname@example.org> wrote:
> Sergiy Radyakin <email@example.com> et al. --
> Beware of posts with the word "bug" as they usually describe a
> misunderstood feature.
> help syntax##description_of_varlist
> documents that
> The default is to fill [varlist] in with all the variables. If
> default=none is specified, it is left empty.
> Typing varname
> is equivalent to typing varlist(max=1).
> using either
> syntax [varname (default=none)]
> syntax varname
> is safest.
> On Thu, May 22, 2008 at 5:17 PM, Sergiy Radyakin wrote:
> > Dear All,
> > I wonder if this is an intended behaviour (quite dangerous from my
> > point of view) or just a bug?
> > // --- Begin of file SyntaxBug.do ---
> > program drop _all
> > drop _all
> > generate VariableThatShouldNotBeUsed=.
> > program define SyntaxBug
> > syntax [varname]
> > di `"`varlist'"'
> > end
> > SyntaxBug
> > generate JustAnotherVariable=.
> > SyntaxBug
> > // --- End of file SyntaxBug.do ---
> > If this is not an intended behaviour then it is actually two bugs in
> > one, since under some conditions the program will process the
> > variable that the user DID NOT specify (first call in the program above), and
> > under other conditions it will refuse to work though it supposedly
> > should (second call in the program above).
> > If this is intended (maybe there is an explanation) then it goes
> > against the documentation, saying anything in [ ] is optional.