Re: st: Poststratification weighting, subpop, and missing values

From   Stas Kolenikov
Subject   Re: st: Poststratification weighting, subpop, and missing values
Date   Thu, 27 Sep 2012 09:44:43 -0500

On Thu, Sep 27, 2012 at 9:03 AM, Steve Samuels wrote:
>> 3. I use the clause "if !missing(y)" above, rather than "if y ~=.", because
>> the latter would not capture missing values like ".a".
> This seemed like a slick idea at 5:00 am, but Nick Cox privately reminded me of
> a far better one to accomplish the same thing:
> "Tony Lachenbruch pointed out in 1992 that -if y < .- saves a character
> on -if y != .- or -if y ~= .- and the tip gained extra force when .a
> ... .z were introduced."

With all due respect, I would never use a shortcut that relies on
the ordering of missing values. First, I personally do not keep track
of that in my head (that is to say, I don't know Stata that well,
after all). Second, this is something programmers refer to as "strong
coupling": it relies on knowledge of highly technical details of the
internal implementation of missing values that may or may not be
legal to use, or stable across versions. There is arguably legacy
code inside official Stata that does use this ordering, so StataCorp
is unlikely to change it, but there is no guarantee that in Stata 45
it will still be that way. From the perspective of stability and
independence of implementation, the function -missing()- is provided
exactly so that code deals correctly with whatever the system of
missing values is in that Stata 45 version. Third, strong coupling
also means that if somebody else with little knowledge of Stata were
to take that code and port it into, say, R or Python or whichever
other software has a different system of missing values (say, the
missing value is MINUS infinity rather than PLUS infinity, or is
not-a-number and cannot be compared to a number), then this snippet
of code will produce errors at best, and totally wrong results at
worst.
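
To make the third point concrete, here is a minimal Python sketch of
what a literal port could look like. It assumes the ported data uses
NaN where Stata had "."; the variable names and the sample values are
hypothetical and only for illustration.

```python
import math

# Assumption: after the port, NaN stands in for Stata's "." missing value.
MISSING = float("nan")
y = [1.5, MISSING, -2.0, 0.0]  # hypothetical sample data

# Literal port of -if y < .-: every comparison with NaN is False,
# so all observations are dropped and no error is raised.
ported_shortcut = [v for v in y if v < MISSING]

# Port of -if !missing(y)-: tests for missingness directly and does not
# depend on where missing values sit in the ordering.
explicit_check = [v for v in y if not math.isnan(v)]

print(ported_shortcut)  # []
print(explicit_check)   # [1.5, -2.0, 0.0]
```

The shortcut version runs without complaint and silently returns an
empty sample, which is the "totally wrong results" case; the explicit
check keeps the three non-missing values.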

Fourth, I type fast enough to put -if !missing(y)- in my do-files in
about as much time as it would take me to type -if y < .-.

-- Stas Kolenikov, PhD, PStat (SSC)
-- Senior Survey Statistician, Abt SRBI  ::  work email kolenikovs at srbi dot com
-- Opinions stated in this email are mine only, and do not reflect the
position of my employer