
st: Re: strange world

From   Ronan Conroy <[email protected]>
To   [email protected]
Subject   st: Re: strange world
Date   Wed, 9 Jan 2008 11:15:11 +0000

There's another consideration too. Logical operators are often found in complex expressions. Sometimes you have to guard against missing values and sometimes you don't: some expressions depend on all variables being nonmissing, while others do not. For example:


. gen underweight = (bmi1 < 19) | (bmi2 < 19) | (bmi3 < 19)
. lab var underweight "At least one body mass index below 19"

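The variable thus defined can be calculated even when one or two of the BMI variables are missing, because Stata treats a missing value as larger than any number, so a comparison such as bmi1 < 19 is simply false when bmi1 is missing. (By the same rule, the expression evaluates to 0, not missing, when all three are missing.) If that's fine by you, then Stata should not stand in your way.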
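
To see how the expression behaves with missing values, here is a minimal sketch on made-up data. The three observations and the variable name underweight_demo are invented purely for illustration; they are not from any real dataset.

```stata
* Made-up data for illustration only: three people, some BMI values missing
clear
input bmi1 bmi2 bmi3
18  .  .
 .  .  .
22  . 25
end

* Stata treats missing (.) as larger than any number, so (. < 19) is false
gen underweight_demo = (bmi1 < 19) | (bmi2 < 19) | (bmi3 < 19)
list
```

The first observation gets 1; the other two get 0, including the one with no BMI measurements at all. Whether a 0 is an acceptable answer for that person is exactly the kind of decision the analyst has to make.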

The user might specify

. egen bmi_missing = rowmiss(bmi1 bmi2 bmi3)
. gen underweight = (bmi1 < 19) | (bmi2 < 19) | (bmi3 < 19) if bmi_missing < 2

which would allow the expression to be evaluated if there were at least two BMI measurements. But the choice of how many missing measurements to tolerate has to be a scientific one.

For this reason, I think that the user is the only person who knows under what circumstances a logical expression should evaluate to missing. It's unfortunate that Stata, SAS and S-Plus/R handle missing data in logical expressions differently, but I don't think that switching to the S philosophy, in which x < NA evaluates to NA, would make things any easier.
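
For anyone who does want the S-style behaviour for a particular variable, it can be imitated by hand. The following is only a sketch under stated assumptions: the data are made up, and underweight_na is a hypothetical name. It mimics the rule in R that TRUE | NA is TRUE while FALSE | NA is NA.

```stata
* Made-up data for illustration only
clear
input bmi1 bmi2 bmi3
18  .  .
22  . 25
22 24 25
 .  .  .
end

* Step 1: 1 if any observed BMI is below 19; left missing otherwise for now
gen byte underweight_na = 1 if (bmi1 < 19) | (bmi2 < 19) | (bmi3 < 19)

* Step 2: 0 only when all three were measured and none is below 19
replace underweight_na = 0 if missing(underweight_na) & !missing(bmi1, bmi2, bmi3)
list
```

Here the result is 1 for the first observation, missing for the second (no low value seen, but one measurement is unknown), 0 for the third, and missing for the fourth. It takes two lines instead of one, which is rather the point: the rule for when the answer should be missing is spelled out by the user, not imposed by the software.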

A happy new year, belatedly, to one and all!

