Statalist



RE: error mesage convention [was: Re: st: Importing data with infile: Identifying records with problems]


From   "Lachenbruch, Peter" <Peter.Lachenbruch@oregonstate.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: error mesage convention [was: Re: st: Importing data with infile: Identifying records with problems]
Date   Thu, 3 Sep 2009 09:02:00 -0700

When I am creating a do-file or an ado-file, I always have errors.  I have found that judicious use of display commands (disp "got to line 20") tells me at least where there isn't an error.  This is probably way too old-fashioned, but it works for me.

Tony

Peter A. Lachenbruch
Department of Public Health
Oregon State University
Corvallis, OR 97330
Phone: 541-737-3832
FAX: 541-737-4001


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Sergiy Radyakin
Sent: Wednesday, September 02, 2009 4:33 PM
To: statalist@hsphsun2.harvard.edu
Subject: Re: error mesage convention [was: Re: st: Importing data with infile: Identifying records with problems]

On Wed, Sep 2, 2009 at 5:42 PM, Maarten Buis <maartenbuis@yahoo.co.uk> wrote:
> --- On Wed, 2/9/09, Sergiy Radyakin wrote:
>> When it comes to errors in .ado programs, it only tells
>> you the error code, but not the line number.
>>
>> At least basic error reporting in relation to the source
>> code is highly desirable and would make debugging .ado
>> files a much more pleasant task, ultimately improving the
>> quality of the code. See, how it was done in TurboPascal
>> in the 1980s
>
> I think that we can safely assume that the people who made
> this decision were very well aware of this type of error
> message and thought very carefully about it.
>
> One reason that I would find convincing is that in the day
> to day use of Stata the vast majority of error messages are
> probably not generated by errors in programs, but by wrong
> inputs by the users. Moreover, the vast majority of users
> are not, and should not be, programmers, and often are
> intimidated by technical looking error messages. The error
> messages as they look now are often very informative
> for those users.

Hi Maarten, this is exactly the problem: without the user's input, I
don't know where the problem is, so the only choice is to ask the user
to send me the FULL trace of the program execution.

Last week I received three trace files. The largest of them was
615,915,398 bytes. Reportedly there was an error 503 somewhere in there.
You may guess how long it took to figure out where and why it happened.
Or do you think it was the last line, because Stata is smart enough to
stop after an error?
[No, there is no option to stop after an error, or after an error with
a particular error code.]
Most users would not even want to bother with zipping that trace file:
"If the program didn't work, why care?"
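
One way to avoid reading such a file by hand is to stream it and report
where a given return code is printed. Below is a minimal Python sketch,
not a definitive tool. It assumes that the trace log shows an uncaptured
error as a line of the form r(503); (the way Stata prints it at the
console). The names find_return_codes and scan_file are made up for this
illustration.

```python
import collections
import re
from typing import Iterable, Iterator, List, Tuple


def find_return_codes(
    lines: Iterable[str], code: int, context: int = 3
) -> Iterator[Tuple[int, List[str]]]:
    """Yield (line_number, preceding_lines) for every line reporting r(code);."""
    # Assumption: the error appears in the log as a line starting with "r(503);".
    pattern = re.compile(r"^\s*r\(%d\);" % code)
    recent = collections.deque(maxlen=context)  # the last few lines seen so far
    for lineno, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if pattern.match(text):
            yield lineno, list(recent)
        recent.append(text)


def scan_file(path: str, code: int, context: int = 3) -> None:
    """Print each hit in a (possibly very large) trace file, reading line by line."""
    with open(path, "r", errors="replace") as handle:
        for lineno, before in find_return_codes(handle, code, context):
            print("r(%d) reported at line %d of %s" % (code, lineno, path))
            for text in before:
                print("    " + text)
```

Calling scan_file("trace.log", 503), where the file name is hypothetical,
reads the file once without loading it into memory. An error swallowed by
-capture- may not be printed in this form, so the list of hits can be
incomplete.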

The majority of programs at SSC are a single ado file and a single help
file, and trace is quite helpful in developing and debugging those. (A
good exercise here would be to look for mistakes in an average SSC file, e.g.:
view http://fmwww.bc.edu/RePEc/bocode/a/appendfile.ado)

Once you get past the first 10,000 lines of your code and a call depth
of 10, things change a lot. A simple flag
"do-not-trace-code-from-base-or-update-directories"
would have saved me about 1.535032 godzillion hours. The same would be
true if Stata said that I have a conformability error in line 1179 of
file whatever.ado.

And I don't see how one number could be intimidating. After all, the
user can copy and paste the message to the author and ask, "Can I have
this problem fixed?" In fact, it would spare the user from having to
describe "how did it happen?"

Typically it goes: I opened the dataset, then I typed something in some
variables, I don't remember what and where, then I built some graphs and
printed them, I have an HP printer, and if it matters we changed the
drivers last week. We always update drivers on the first Monday of every
month. I just thought it could be the reason. Photoshop is also working
very slowly these days... and it goes on like that forever. YOU CANNOT
EXPECT THAT EVERY USER WILL BE SMART ENOUGH TO UNDERSTAND ___WHAT___ SHE
DID WRONG TO GET THE ERROR MESSAGE. If they could, I think it is safe to
assume they wouldn't do it wrong. How, then, can they explain to the
developer that a certain situation is not accounted for in the program?

Stata itself (when it crashes) displays not only the error code but
also the line number and filename where it occurred. This is an actual
message from Stata crashing:

	Microsoft Visual C++ Runtime Library
	Program: \\??????????\Stata10\wsestata.exe
	File:..\..\w\stedit.c
	Line: 4143
	Expression: m!=NULL
	For information on how your program can cause an assertion failure,
	see the Visual C++ documentation on asserts. (Press Retry to debug
	the application - JIT must be enabled).

Any developer would consider this HELPFUL information, as opposed to
(in Stata's language):
assertion is false
r(9);

Which assertion? Where? Why?


Intimidating is the other extreme, something like this (not my image,
but intimidating enough for an example):
http://www.itnstuff.com/wp-content/uploads/2009/07/psod.jpg
where users don't know whether they are sending the trace of their
program or the list of their credit card numbers.

Although Stata can give you that too (actual output):

maxmaxvar=5000 maxlrecl=60000
conslen=165216 macrolen=165200
mtr.n = 20
 0.            02d7b6b8 +     4e24 = 02d804dc  (vl_vr)
 1.      1dc + 02d806b8 +     4e20 = 02d854d8  (mapi)
 2.      1e0 + 02d856b8 +     2712 = 02d87dca  (srt)
 3.       1e + 02d87de8 +    28488 = 02db0270  (lbl)
 4.      b78 + 02db0de8 +    3bd08 = 02decaf0  (fmt)
 5.      2f8 + 02decde8 +    28488 = 02e15270  (varlist)
 6.      b78 + 02e15de8 +     1388 = 02e17170  (typ)
 7.       18 + 02e17188 +     2712 = 02e1989a  (by)
 8.       1e + 02e198b8 +    13890 = 02e2d148  (vl_rcp)
 9.      770 + 02e2d8b8 +    28550 = 02e55e08  (macexp)
10.      ab0 + 02e568b8 +    28560 = 02e7ee18  (console)
11.      aa0 + 02e7f8b8 +    28561 = 02ea7e19  (revbuf)
12.   ba826f + 03a50088 +    a1580 = 03af1608  (achar)
13.      a80 + 03af2088 +    28561 = 03b1a5e9  (keyio)
14.   10469f + 03c1ec88 +     1318 = 03c1ffa0  (ifpgm)
15.       18 + 03c1ffb8 +     1318 = 03c212d0  (wepgm)
16.       18 + 03c212e8 +     4e24 = 03c2610c  (vl_vr)
17.      1dc + 03c262e8 +    13890 = 03c39b78  (vl_rcp)
18.            03c1ec88 +     1318 = 03c1ffa0  (ifpgm)
19.       18 + 03c1ffb8 +     1318 = 03c212d0  (wepgm)
--.   22ed38 + 03e50008
NTRACKS too small
total fixed = 20fa9e (2161310)

nvar 0 nobs 0
mvar 200 mobs 252061 tobs 252061
lrecl 200
 3013b10 memsize
 403c4f0 xtraptr
 7050000 vlblptr
 7050000 cddaptr
 3e50008 begin_of_memory
 3200000 cur_tm
matsize=400
unfree 3c6d8c8 - 7060048 = fcc0d880

With other undocumented commands you can dig deeper.

Best, Sergiy Radyakin


>
> This means that the price that we programmers have to pay
> for the comfort of the vast majority of users is that we
> have to learn how to effectively use other debugging
> tools like -set traced #- and -set trace on-. I think that
> that is a price well worth paying.
>
> -- Maarten
>
> --------------------------
> Maarten L. Buis
> Institut fuer Soziologie
> Universitaet Tuebingen
> Wilhelmstrasse 36
> 72074 Tuebingen
> Germany
>
> http://www.maartenbuis.nl
> --------------------------
>
>
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



© Copyright 1996–2014 StataCorp LP