AW: st: AW: maximum number of characters in do file
"Martin Weiss" <email@example.com>
Thu, 26 Mar 2009 16:39:40 +0100
"And sometimes a longer message would just seem patronising and not really
helpful."
The only time that Stata deviates from this laudable practice is in the
warning triggered by non-estimation commands being fed to -bs-:
sysuse auto, clear
bs r(mean): su pr
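For anyone who has not seen the warning, the abbreviated call above can be spelled out as in the sketch below. The reps() and seed() values are arbitrary assumptions, not part of the original post; -nowarn- is the -bootstrap- option that suppresses the message.

```stata
sysuse auto, clear

* Full-length form of -bs r(mean): su pr-. Because -summarize- is not an
* estimation command and does not set e(sample), -bootstrap- prints its
* long warning before the results.
bootstrap r(mean), reps(100) seed(2009): summarize price

* Same call with the warning suppressed (reps and seed are arbitrary).
bootstrap r(mean), reps(100) seed(2009) nowarn: summarize price
```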
[mailto:firstname.lastname@example.org] On behalf of Nick Cox
Sent: Thursday, 26 March 2009 16:29
Subject: RE: st: AW: maximum number of characters in do file
There's lots here to comment on.
An old joke, which I know from Woody Allen, has two people grumbling:
"The food here is terrible!"
"And such small portions!"
I formalise this as a
Metacomplaint: The trouble with complaints is that they are so often
One gloss: Individual users sometimes ask for things they probably don't
want as much as they think they do.
Another gloss: Collectively, users may ask for quite different things, and
then what is StataCorp to do?
From the top, let's subdivide:
1. Error messages are issued by Stata when, from its point of view, you get
something wrong. Traditionally, they are terse.
Everyone on this list has no doubt been frustrated at least some of the time
by terse error messages from Stata. But the positive side is that this is
really complimentary: Stata is assuming that you are smart enough to figure
it out. And sometimes a longer message would just seem patronising and not
really helpful. Imagine that "syntax error" were replaced by
"I'm sorry, but I can't figure out what you mean by this code. There is an
error in it somewhere. You should look at it again or ask any Stata expert
you know. Or you could email Statalist or email@example.com for help."
How quickly would that get irritating when repeated several times in a
session! Or thousands and thousands of times in your Stata career!
Really, terse is good.
2. Warnings are sometimes given by Stata. I don't know what the policy is on
when warnings are given and when not. I guess there's splendid adhockery. It
becomes obvious that people will or do often misunderstand a specific
problem, and warnings are added.
3. Advice. Sergiy wants Stata to go further at least some of the time and,
in a nutshell, give advice.
Abstractly, who can disagree? Who hasn't been at a loss to know what to do
in some Stata problem?
Concretely, let us think of how this would work in practice. Consider the
specific problem of a .do file that won't fit in the Do-file Editor which
started this thread.
First off, I don't well understand what people don't understand about the
error message here. If the file won't fit, you need a smaller file or to
handle it with software that can cope. That's not, in the old cliché, rocket
science. (Rocket science is reserved by statute on this list for Al Feiveson
of NASA.) But there is, as so often, a selection problem. All the people who
realise instantly that this is the answer don't trouble Statalist with
queries on the matter.
But the point, optimistically, is perhaps that some people do well
understand the error message; they just want advice on what to do instead.
If a graduate student of mine came to me with this error message I would
(gently) ask them how on Earth they got to be working with such big do files
and tell them not to do it. Sergiy gives one kind of reason why do files may
be big: they may include lots of data. Another is that people are making
lots of specific small edits to the data and missing efficient ways to do
that. Then I would explain about using text editors and give one or two
Naturally, other people might well have different advice.
What advice is Stata going to give then?
"You are working with a very big do-file. That's not a good idea. Split it
up. Put any data in one or more separate data files."
In all seriousness, I think that's the best concise advice that can be given
in this situation. But it's not hard to imagine that many users would regard
such advice as patronising, irritating, offensive, irrelevant, etc., even
with moderate rewording.
Sergiy's own suggestion is
"Stata program itself can execute longer .do files, and it is only a
limitation of the do-files editor, this limitation also applies to the older
version of the Notepad.exe (standard Windows text editor) but other editors
don't have this limitations and can be used to edit do files externally
without loss of functionality".
Setting aside the delicate Windows issue here (why should anyone care about
the limitations of Notepad, especially Macintosh or Linux users?) Sergiy is,
as anyone else would have to do, assuming certain knowledge when he writes
this. The advice is going to be just right for some users, cryptic for
people who don't know what text editors are, and pointless for many savvy
users who know all this, and much more.
In a nutshell: Advice is best given person to person in a conversation that
establishes what the user knows and wants. It is not best given by a program
that does not know what the user knows or wants.
Suppose your R-square is very low. Should Stata give the advice that you
have a lousy model and should think again? If not, then where do you draw
the line between situations calling for advice and those not?
Incidentally, one persistent hallmark of Sergiy's posts on the list is his
frustration with whatever is not documented by StataCorp -- tempered by his
occasional delight in having unearthed some such thing, and (less
frequently) by his explanation to everyone else of what he has found.
I've often thought that what he wants is R, in which everything is open, at
the price of knowing where to find it and of reading the code. But, no, his
model of programming is -- Visual Studio, in which Microsoft reveals itself
as a paragon of responding to user suggestions. All I can say is that
doesn't tally with my experience with Microsoft software.
More seriously, I emphatically would not suggest that the way to find out
about Stata's network protocols is to Google for that. The best way to find
out about something deep down and technical and undocumented in the manuals
is to ask via tech-support.
may I ask what is the reason to put so much info in the .do file in
the first place?
If this is data, and it is coming from Excel, why not save it as an
ASCII file and merge to your dataset?
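A minimal sketch of that suggestion: keep the data in a text file and bring it in with a couple of commands. Everything named here is an assumption for illustration. prices.csv and prices.dta are hypothetical files written to the current directory, and the shipped auto data stand in for the real dataset. -export delimited-, -import delimited- and the -merge 1:1- syntax belong to Stata releases later than the one current when this thread was written; in older releases -outsheet- and -insheet- played the same role.

```stata
* Step 1 (stand-in for "save the Excel sheet as an ASCII file"):
* write two columns of the auto data to a comma-separated text file.
sysuse auto, clear
keep make price
export delimited using "prices.csv", replace

* Step 2: read the text file back and save it as a Stata dataset.
import delimited using "prices.csv", varnames(1) clear
save prices, replace

* Step 3: merge it into the main data on the key variable -make-.
sysuse auto, clear
drop price
merge 1:1 make using prices

* Check how many observations matched.
tabulate _merge
```

The do-file then holds only the import and merge commands, however large the text file grows.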
Mixing code and data is not always a good idea for several reasons,
which may or may not apply in your case:
1. portability of code (what would a researcher do if she needs to
replicate your computations in SPSS? retype the whole program? or just
one command which imports/merges data?)
2. changes in data (what happens if the care provider's data need to
be updated some time later? changing the program carries a risk of
introducing mistakes, etc)
3. program is hard to read and understand (and particularly to print
if that matters) if the flow of code is interrupted by data generating
To be fair, sometimes one does want the program and data to be in the
single file, when there is a risk of separation between them, which
can make both data and code useless, or when it is known that none
will be modified later. A single file is generally easier to
While there is a significant bias in this list to create FAQ bullets
from questions like this, and directing the answer-seeking researchers
to them every couple of weeks, the problem can be solved by simply
modifying the error message to be helpful, which is not to say
"something is impossible to do", but "this is the way how you should
do the thing you want". E.g. in this case the error message could have
contained the text saying that "Stata program itself can execute
longer .do files, and it is only a limitation of the do-files editor,
this limitation also applies to the older version of the Notepad.exe
(standard Windows text editor) but other editors don't have this
limitations and can be used to edit do files externally without loss
of functionality".
Perhaps the most radical feature of Visual Studio where I have
recently switched for my programming is that users can easily
contribute into the help shown for any procedure, error message,
visual component or whatever topic. Moreover it is not the user's
responsibility to submit their contributions to Microsoft, since
external sites and blogs are indexed as well, and show up in the
search while Microsoft keeps the authority for maintaining the
official documentation. I have no idea how it is implemented, but it
Would be nice to have Stata provide a possibility for the community to
maintain a "commented version" of the official help, again, explaining
"what to do next? how do I make it work?", rather than "what went
E.g. suppose you get an error 671 "unknown network protocol", what
would be the next step, besides calling the tech-support?
[if you answered "Google it!" or "GOOGLE IT!!!" try to google it and
see if anything beyond this page turns up in the results].
PS: not to hijack the thread, but out of pure curiosity: if some
experienced Google-searchers can actually find the list of what Stata
considers "known protocols", I would appreciate getting a link to it.
* For searches and help try: