
From: "Nick Cox" <n.j.cox@durham.ac.uk>

To: <statalist@hsphsun2.harvard.edu>

Subject: st: RE: Going Loopy!!

Date: Mon, 7 Apr 2003 15:29:19 +0100

Joel Clovis

> can it be the !missing operator ignores "." for missing observations?

missing(.) is 1 (true), while missing(".") is 0 (false).

Here, as wider context, is an extract from an article in the Stata Journal 2(3): 314–329 (2002). It is out-of-date in that it does not mention extended missing values .a ... .z.

Numeric missing in Stata is represented by a period ., which is always treated as larger than any other numeric value. Thus a -sort- of a numeric variable sorts missing values to the end of a data set.

String missing in Stata almost always means an empty or null string, "". An empty string contains nothing (or does not contain anything, depending on your metaphysical predilections), whatever the type of string variable concerned. A -sort- of a string variable sorts empty strings to the beginning of a data set.

No special meaning is given by Stata to strings consisting of one or more spaces. If you want such strings to be treated as missing, consider using -replace- with the -trim()- function to reduce them to empty strings.

Occasionally, some commands treat ".", or even that together with any leading or trailing spaces, as indicating missing. This is anomalous and deserves brief comment. -destring- is a case in point, and in this instance the anomaly can be justified. ...

-destring- is for situations in which a variable should be numeric, but is by mistake string. For example, suppose you typed a column of data into Stata's data editor, but by mistake typed a non-numeric character in the value for the first observation. (You may have been thinking of a header line, spreadsheet style.) Thereafter, you typed numeric characters, including . for missing values. The result of all this is that Stata interprets the column as a string variable, but that is almost certainly wrong. -destring- feels free to interpret the string ".", or any string which -trim()- reduces to ".", as really numeric missing.
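The points in the extract so far can be checked with a short do-file. This is only a sketch, not code from the article: the variable names x, s, t and n are made up for illustration, and it assumes Stata 8 or later, where -missing()- and the extended missing values are available. Each -assert- stops with an error if the stated behaviour does not hold.

```stata
* Illustrative sketch; the variable names x, s, t and n are hypothetical.
clear
set obs 4

* A numeric variable with one missing value
generate x = _n
replace x = . in 2

* A string variable holding "a", an empty string, "." and two spaces
generate str5 s = ""
replace s = "a"  in 1
replace s = "."  in 3
replace s = "  " in 4

* missing() is true for numeric missing (including extended missing .a)
* and for the empty string, but not for the strings "." or "  "
assert missing(.)
assert missing(.a)
assert missing("")
assert !missing(".")
assert !missing("  ")
count if missing(s)
assert r(N) == 1

* Numeric missing sorts last; the empty string sorts first
sort x
assert missing(x[_N])
sort s
assert s[1] == ""

* trim() reduces all-space strings to empty, so they then count as missing
replace s = trim(s)
count if missing(s)
assert r(N) == 2

* -destring- reads the string "." as numeric missing
generate str5 t = string(_n)
replace t = "." in 2
destring t, generate(n)
assert missing(n[2])
assert n[3] == 3
```

If the aim is for "." in a string variable to count as missing, one option in the same spirit is to -replace- it with "" before testing with -missing()-.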
-destring- is, not surprisingly, much more circumspect about other non-numeric characters.

Another exception is -compare-, introduced into official Stata 3.0 in March 1992. As explained at [R] compare, this command, in deference to some users' habits, understands "." as well as empty strings as indicating string missing. With perfect hindsight, this broad-mindedness was perhaps a mistake, but it does very little harm and is better left unchanged, just in case a change breaks someone's long-standing do files or programs.

Finally, note that the -string()- function ... yields ".", not "", as the string equivalent of numeric missing.

Nick
n.j.cox@durham.ac.uk

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

**References**: **st: Going Loopy!!** *From:* "Joel Clovis" <joel.clovis@stud.man.ac.uk>
