Stata The Stata listserver

st: RE: Going Loopy!!


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Going Loopy!!
Date   Mon, 7 Apr 2003 15:29:19 +0100

Joel Clovis

> can it be the !missing operator ignors "." for missing observations?

missing(.) is 1 (true), while missing(".") is 0 (false).
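
That can be checked directly with -display-. This is a minimal sketch; the third line is an addition here, showing that the empty string is what -missing()- does count as string missing.

```stata
* numeric missing: true
display missing(.)
* the one-character string ".": false
display missing(".")
* the empty string: true
display missing("")
```

So a test like !missing(strvar) lets "." through whenever strvar is a string variable.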

As wider context, here is an extract from an article in the Stata
Journal 2(3): 314-329 (2002). It is out of date in that it does not
mention the extended missing values .a ... .z.

Numeric missing in Stata is represented by a period ., which always is
treated as larger than any other numeric value.
Thus a -sort- of a numeric variable sorts missing values to the end of
a data set.
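
A small sketch of what this implies, using a toy variable x made up for illustration. The practical catch is that a condition such as x > 2 is also true for missing values.

```stata
clear
input x
3
.
1
end

* missing sorts to the end: order becomes 1, 3, .
sort x
list

* counts 2 observations (3 and .), since . exceeds any number
count if x > 2

* counts 1 observation once missings are excluded
count if x > 2 & !missing(x)
```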

String missing in Stata almost always means an empty or null string,
"". An empty string contains nothing (or does not contain anything,
depending on your metaphysical predilections), whatever the type of
string variable concerned. A -sort- of a string variable sorts empty
strings to the beginning of a data set.
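
By way of illustration, here is a sketch with a toy string variable s, invented for this example:

```stata
clear
input str5 s
"b"
""
"a"
end

* the empty string sorts first: order becomes "", "a", "b"
sort s
list

* missing() is true only for the empty string here
count if missing(s)
```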

No special meaning is given by Stata to strings consisting of one or
more spaces. If you want such strings to be treated as missing,
consider using -replace- with the -trim()- function to reduce them to
empty strings.
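
One way to do that, sketched with a made-up variable s. This version only blanks out values that consist entirely of spaces and leaves other values untouched; -replace s = trim(s)- would also strip leading and trailing spaces everywhere.

```stata
clear
set obs 3
generate str5 s = "a"
replace s = "   " in 2
replace s = "" in 3

* only the empty string counts as missing so far
count if missing(s)

* reduce all-space strings to empty strings
replace s = "" if trim(s) == ""

* now the all-space value counts as missing too
count if missing(s)
```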

Occasionally, some commands treat ".", or even that together with
any leading or trailing spaces, as indicating missing. This is
anomalous and deserves brief comment.

-destring- is a case in point, and in this instance the anomaly can be
justified. ... -destring- is for situations in which a variable should
be numeric, but is by mistake string. For example, suppose you typed a
column of data into Stata's data editor, but by mistake typed a
non-numeric character in the value for the first observation. (You may
have been thinking of a header line, spreadsheet style.) Thereafter,
you typed numeric characters, including . for missing values. The
result of all this is that Stata interprets the column as a string
variable, but that is almost certainly wrong. -destring- feels free to
interpret the string ".", or any string which -trim()- reduces to ".",
as really numeric missing. It is, not surprisingly, much more
circumspect about other non-numeric characters.
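
The scenario can be sketched as follows. The variable name ht and its values are invented for illustration, and blanking the stray header by hand is just one possible repair.

```stata
clear
input str8 ht
"height"
"1.72"
"."
"1.80"
end

* -destring- declines to convert while "height" is present
destring ht, replace

* remove the stray header text, then try again
replace ht = "" in 1
destring ht, replace

* ht is now numeric; both "" and "." have become numeric missing
describe ht
list
```

The -force- option of -destring- would instead convert any non-numeric value to missing, which is convenient but hides exactly the problems worth looking at.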

Another exception is -compare-, introduced into official Stata 3.0
in March 1992. As explained at [R] compare, this command, in deference
to some users' habits, understands "." as well as empty strings as
indicating string missing. With perfect hindsight, this
broad-mindedness was perhaps a mistake, but it does very little harm
and is better left unchanged, just in case a change breaks someone's
long-standing do files or programs.

Finally, note that the -string()- function ... yields ".", not "", as
the string equivalent of numeric missing.
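
A brief sketch of the consequence, with toy variables x and s made up for the purpose. If you want string missing after conversion, the "." values have to be blanked explicitly.

```stata
clear
input x
1
.
end

* string(.) is ".", which missing() does not regard as missing
display string(.)
display missing(string(.))

* convert, then map "." to the empty string
generate s = string(x)
replace s = "" if s == "."
count if missing(s)
```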

Nick
n.j.cox@durham.ac.uk
