The Stata listserver

Re: Missing as true - Was: Re: st: RE: Another Stata feature


From   Allan Reese <[email protected]>
To   Stata Listserve <[email protected]>
Subject   Re: Missing as true - Was: Re: st: RE: Another Stata feature
Date   Thu, 8 Jan 2004 10:42:26 +0000 (GMT)

On Wed, 7 Jan 2004, Bill Rising wrote [with RAR's inserts]:
> ..., it would make Stata
> code easier to read and less prone to error if people could code the
> [potentially RAR] incorrect
>
> regress foo bar if snafu
>
> instead of the [intended? RAR] correct
>
> regress foo bar if snafu & snafu < .
>
> for snafu being some sort of indicator which could be missing.
>
> I've used Stata long enough that the latter comes natural to me. Still,
> I'd hate to see how many analyses have been found invalid because of
> folks forgetting the extra 'less than missing' clause.

My point exactly.  It *is* documented, thus making it a feature, but who
reads documentation?  Is it *known* to all members of this list?  To all
Stata users?  I doubt it.  It raises anomalies, as here:

. gen m = var1>0
. gen l = var1<0
. list var1 m l
     | var1   m   l |
  1. |    1   1   0 |
  2. |    2   1   0 |
  3. |    3   1   0 |
  4. |    .   1   0 |
  5. |   -1   0   1 |
  6. |    0   0   0 |
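
One way to avoid the anomaly is to make the indicators themselves missing
wherever var1 is missing.  The sketch below assumes var1 holds exactly the
six values listed above; m2 and l2 are illustrative names, not from the
original post.

```stata
* Rebuild the six-observation example (values assumed from the listing above)
clear
input var1
1
2
3
.
-1
0
end

* Restrict each comparison to nonmissing var1; excluded rows are left missing
generate m2 = var1 > 0 if !missing(var1)
generate l2 = var1 < 0 if !missing(var1)
list var1 m2 l2
```

With this guard, observation 4 gets missing in both m2 and l2 rather than a
1 and a 0.  For an indicator coded strictly 0/1, writing "if snafu == 1"
should similarly exclude missing values, since a missing value never equals 1.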

Within the Stata language, "missing" compares as a positive number larger
than any nonmissing value, but that is not a natural treatment of missing
data.  In the same way that "replace" by
default reports "n values changed", I suggest it would be more sporting to
report "missing values used in calculation - check answers".
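
Until Stata offers such a report, a user can approximate it by hand.  The
following is only a sketch of one possible check, assuming a numeric variable
named var1 as in the listing above; the wording of the message is invented
for illustration and is not anything Stata itself prints.

```stata
* Hypothetical check: warn before var1 is used in a comparison
quietly count if missing(var1)
if r(N) > 0 {
    display as error "warning: " r(N) " missing values in var1 - check answers"
}
```

A stricter variant is "assert !missing(var1)", which stops a do-file with an
error as soon as any missing value is present.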

Since Nick insists I spell out the joke (?), we were told that the basis
for invading Iraq was that wmd was definitely TRUE.  It subsequently turns
out that the data were incomplete or inconclusive.  But if wmd>0 computes
as TRUE for missing data, they can justify any political or management
decision.

I have had similar exchanges on the discussion list devoted to spreadsheet
use.  The techies say, "It's a documented feature, so everyone knows", and
the managers say, "We got the answer from the computer, so it must be
correct."  There is a wonderful area of computer science devoted to
*proving* programs are correct; I've never seen evidence of an automated
procedure that is capable of checking that the correct variable was named
in an expression or that the correct operator was used.

History demonstrates that it is only after a sequence of disasters that
"management" accept that systems should be self-checking and error
avoiding.  Relying on people to "do the right thing" in all circumstances
is a proven recipe for disasters.  WRT software, why can't we abridge the
historic process?

R. Allan Reese                       Email:     [email protected]

====================================================================
It was reported last week that a passenger on the aircraft that crashed
shortly after take-off was phoning a friend and said that something was
wrong.  Presumably, the passenger also told the friend that the crew
had just announced that all mobiles must be switched off as they might
interfere with the aircraft's systems.



