# RE: st: question on cond( )

From: "Nick Cox"
Subject: RE: st: question on cond( )
Date: Thu, 28 Feb 2008 18:02:19 -0000

```
When David Kantor and I wrote a tutorial on -cond()- in

SJ-5-3  pr0016  . . Depending on conditions: a tutorial on the cond() function
        . . . . . . . . . . . . . . . . . . . . . .  D. Kantor and N. J. Cox
        Q3/05   SJ 5(3):413--420                                (no commands)
        tutorial on the cond() function

we didn't even get to the four-argument case. I agree with Nick Winter
that the on-line help for the four-argument case looks wrong.

I think what Paul wants is well (if not best) coded like this:

gen z = cond(missing(x), ., x > 5)

This has all of a sudden come to be my favourite way to code creation of
dummy/dichotomous/binary/logical/quantal/Boolean variables that could
be 1, 0 or missing. (Any other synonyms?)

It's perhaps simpler at first sight to code like this

gen z = x > 5 if x < .

in which the mapping of missings to missings is tacit: that is, if x is
missing, Stata does not use the result of (x > 5) but assigns missing.

But then if you have some more complicated definition involving two or
more variables, you have to trap all the problems on all the variables:

gen z = x > y if x < . & y < .

This could be

gen z = x > y if !missing(x, y)

but as said I like to turn it round

gen z = cond(missing(x, y), ., (x > y))

That way it's explicit what happens with missings. And it's quite easy
to put in words:

If x or y is missing, return missing; otherwise evaluate (x > y).

Yet more variables can be packed into the -missing()-:

gen z = cond(missing(x, y, a, b), ., (x > y) & (a == b))

In all the above, -gen byte z- rather than -gen z- is careful on storage.

Nick
n.j.cox@durham.ac.uk

Nick Winter

It looks to me like the examples in the help for cond() are either

The function cond(condition,a,b,c) returns -a- if -condition- is true;
-b- if -condition- is false; and -c- if -condition- is missing.

Note that the last is "the *condition* is missing"; that is, the
statement as a whole evaluates to missing.  This is *not* the same as
some part of -condition- evaluating to missing.

So in the example where condition is "x>2", this condition evaluates to
either true or false for all observations, including cases where x=.,
because the condition ".>2" is true under Stata's handling of missing
values (missing counts as larger than any number).

This seems to make the following statement from the help file wrong:
"cond(a>2,"this","that","missing") = "missing" if a > ."

The only way I can think of to trigger the "missing" option would be
something like this:

clear
set obs 10
gen x=_n-1 in 1/8
gen z=cond(x,"true","false","missing")
list

     +-------------+
     | x         z |
     |-------------|
  1. | 0     false |
  2. | 1      true |
  3. | 2      true |
  4. | 3      true |
  5. | 4      true |
     |-------------|
  6. | 5      true |
  7. | 6      true |
  8. | 7      true |
  9. | .   missing |
 10. | .   missing |
     +-------------+

But once you are doing a comparison (x>2), that will always evaluate to
either "true" or "false" in Stata; never to missing.

Visintainer, Paul

> I'm not sure why the "condition" function is not coding z with 2 missing
> values.  If I'm reading the functions command correctly, z should be
> coded as missing:
>
> cond(a>2,"this","that","missing") = "missing" if a > .
> cond(a>2,"this","that","missing") = "this" if a > 2 and a < .
>
> Any ideas?
>
> Thanks.
>
> . gen z=cond(x>5,1,0,.)
>
> . list
>
>      +-------+
>      | x   z |
>      |-------|
>   1. | 1   0 |
>   2. | 2   0 |
>   3. | 3   0 |
>   4. | 4   0 |
>   5. | 5   0 |
>      |-------|
>   6. | 6   1 |
>   7. | 7   1 |
>   8. | 8   1 |
>   9. | .   1 |
>  10. | .   1 |
>      +-------+

```
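
As a recap of the thread, the following is a minimal Stata sketch that puts the four-argument call next to the two codings Nick Cox suggests. The data are rebuilt to resemble Paul's listing (x runs from 1 to 8, followed by two missings); that reconstruction and the variable names z4, zcond and zif are assumptions made for illustration, not part of the original posts.

```stata
* Rebuild a toy dataset resembling Paul's listing (assumed: x = 1..8, then two missings)
clear
set obs 10
gen x = _n in 1/8

* Four-argument form: the comparison (x > 5) is never itself missing,
* so the fourth argument is never used and missing x is coded 1
gen byte z4 = cond(x > 5, 1, 0, .)

* Explicit handling: test for missing first, otherwise evaluate (x > 5)
gen byte zcond = cond(missing(x), ., x > 5)

* Tacit handling: observations excluded by -if- are left missing
gen byte zif = x > 5 if x < .

list x z4 zcond zif
```

In the listing, z4 should be 1 in the last two observations, while zcond and zif should both be missing there and agree everywhere else.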