


Re: st: Poststratification weighting, subpop, and missing values

From   Nick Cox
Subject   Re: st: Poststratification weighting, subpop, and missing values
Date   Thu, 27 Sep 2012 16:03:53 +0100

I take your point, and agree that this kind of technicality can be an
issue for people who may have to manage and/or use code for all kinds
of different software. I am very happy to commend -missing()- and
-!missing()- too and have done so in print.

But the short-cut is not compulsory. It is for those who understand it
and like it. It's fair advice not to use something you won't remember
or don't understand.

Specifically, the idea that missing values count higher than any
non-missing value is built into how Stata -sort-s data and I am
confident that it's here to stay. That's not a matter of internal
implementation, but quite explicit in Stata's behaviour.
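
As a minimal sketch of that ordering, the do-file fragment below sorts a
small variable that contains ., .a and .z and then counts observations
under each of the three conditions discussed in this thread. The
variable y and its five values are made up for illustration and are not
taken from anyone's data.

```stata
* Toy data (hypothetical): two real values and three kinds of missing
clear
input y
1
.
.a
3
.z
end

* Missing values sort after every non-missing value: 1, 3, ., .a, .z
sort y
list, clean

* True only for non-missing values, because . .a ... .z all exceed any number
count if y < .          // 2

* Excludes only the system missing value; .a and .z slip through
count if y != .         // 4

* Excludes every kind of missing value
count if !missing(y)    // 2
```

On these data -if y < .- and -if !missing(y)- select the same
observations; only -if y != .- differs.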

Where do you draw the line anyway? For example, Stata evaluates true
or false as 1 or 0 (e.g. (2 == 2) as 1 and (3 == 2) as 0) and many
Stata programmers use that sort of evaluation routinely. I would not
be put off in the slightest by remembering that there are languages in
which "true" is -1 not 1.
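
A short sketch of that evaluation, again with a made-up variable y, and
of how it interacts with the ordering of missing values when an
indicator is built from a comparison. It is written for a do-file,
where // comments are allowed.

```stata
* Logical expressions evaluate to 1 (true) or 0 (false)
display (2 == 2)    // 1
display (3 == 2)    // 0

* Toy data (hypothetical): two real values and one missing
clear
input y
1
3
.
end

* Missing counts as larger than any number, so this is 1 where y is missing
generate byte big_naive = (y > 2)

* Restricting to non-missing y leaves the indicator missing where y is missing
generate byte big = (y > 2) if !missing(y)

list, clean
```

Here big_naive is 0, 1, 1 and big is 0, 1, . for the three observations.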


P.S. I don't think the Stata users on Stack Overflow who rely on your
support would agree that you "don't know Stata that well, after all".

On Thu, Sep 27, 2012 at 3:44 PM, Stas Kolenikov wrote:
> On Thu, Sep 27, 2012 at 9:03 AM, Steve Samuels wrote:
>>> 3. I use the clause "if !missing(y)" above, rather than "if y ~=.", because
>>> the latter would not capture missing values like ".a".
>> This seemed like a slick idea at 5:00 am, but Nick Cox privately reminded me of
>> a far better one to accomplish the same thing:
>> "Tony Lachenbruch pointed out in 1992 that -if y < .- saves a character
>> on -if y != .- or -if y ~= .- and the tip gained extra force when .a
>> ... .z were introduced."
> With all due respect, I would never use a short cut that deals with
> this ordering of missing values.
>
> First, I personally do not keep track
> of that in my head (that is to say, I don't know Stata that well,
> after all).
>
> Second, this is something programmers refer to as "strong
> coupling", as it relies on the knowledge of highly technical details
> of internal implementation of the missing values that may or may not
> be legal to use and stable in different versions. There's arguably
> legacy code that does use this ordering inside the official Stata
> code, so Stata Corp is unlikely to change that, but there is no
> guarantee that in Stata 45 it will still be that way. From this
> stability and independence of implementation perspective, the function
> -missing()- is provided exactly for the reason of being able to
> correctly deal with whatever the system of missing values is in that
> Stata 45 version.
>
> Third, strong coupling also means that if somebody
> else with little knowledge of Stata were to use that code, and port it
> into say R or Python or whichever other software that has a different
> system of missing values (say, the missing value is MINUS infinity,
> not the PLUS infinity, or is not-a-number and cannot be compared to a
> number), then this snippet of code will produce errors at best, and
> totally wrong results at worst.
>
> Fourth, I type fast enough to put -if !missing(y)- in my do-files in
> about as much time as it would take me to type -if y < .-.