



st: RE: AW: RE: Evaluating a set of conditions


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: AW: RE: Evaluating a set of conditions
Date   Wed, 23 Jun 2010 13:59:44 +0100

Yes, if there are missings it's more complicated than my initial answer
could suggest. 

(a == 1) & (((b == 1) + (c == 1) + (d == 1) + (e == 1)) >= 2)

would seem to match the possibilities better. 
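
A minimal sketch of how that expression behaves, using a small made-up dataset (the five rows are illustrative only). Because x == 1 evaluates to 0 when x is missing, every missing value is counted as "not true" here, and that is an assumption about what missing should mean:

```stata
* Illustrative data only: five hypothetical patients
clear
input a b c d e
1 1 1 0 0
1 1 . . 0
1 1 . 1 0
0 1 1 1 1
. 1 1 1 1
end

* (x == 1) is 0 for missing x, so missings never count towards the total
gen byte disease = (a == 1) & (((b == 1) + (c == 1) + (d == 1) + (e == 1)) >= 2)

list
```

With this version the result is never missing: the last row, where a is missing, is set to 0 rather than left undetermined.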

Nick 
n.j.cox@durham.ac.uk 

Martin Weiss

The result does seem to differ much, though, from the one Thomas
evidently wants - as expressed by his example:

*************
clear*
set obs 10000
set seed 12345

foreach var of newlist a b c d e{
	gen byte `var'=runiform()<.5 
	replace `var'=. if runiform()<.15
}

//NJC
gen disease_true = a & (b + c + d + e >= 2) /* 
*/  if !missing(a, b, c, d, e) 

//Thomas
egen anytwo = rowtotal(a b c d e), missing
egen missing = rowmiss(a b c d e)
replace anytwo = . if (anytwo==0 & missing>=2 & missing<.)
replace anytwo = . if (anytwo==1 & missing==1)
replace anytwo = . if (anytwo==1 & missing==3)
replace anytwo = . if (missing>=4)
gen disease = 1 if (a==1 & anytwo>=2 & anytwo<.)
replace disease = 0 if (a==1 & anytwo<2)
replace disease = 0 if a==0
replace disease =. if a==.

//Comparison
compare disease_true disease
assert disease_true == disease
*************
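
One way to see where two versions disagree, rather than only that they do, is to cross-tabulate them with the missing option. The sketch below assumes the same kind of simulated data as above (with fewer observations) and compares the two formulations posted by Nick Cox; the variable names v_guarded and v_false are made up for illustration:

```stata
clear
set obs 1000
set seed 12345

* Simulated 0/1 indicators with roughly 15% missing values
foreach var of newlist a b c d e {
	gen byte `var' = runiform() < .5
	replace `var' = . if runiform() < .15
}

* Version 1: result is missing whenever any input is missing
gen byte v_guarded = a & (b + c + d + e >= 2) if !missing(a, b, c, d, e)

* Version 2: each missing input is counted as "not true"
gen byte v_false = (a == 1) & (((b == 1) + (c == 1) + (d == 1) + (e == 1)) >= 2)

* Rows with v_guarded missing show how version 2 resolves the incomplete cases
tabulate v_guarded v_false, missing
```

The two versions should agree wherever all five inputs are observed; any disagreement is confined to observations with at least one missing input.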

Nick Cox

I think you need to be clear whether missing means true, false or
indeterminate as far as this is concerned. 

Setting aside missings, as a, b, c, d, e are Booleans (1 = true, 0 =
false) then 

gen disease_true = a & (b + c + d + e >= 2) 

is one way to do it. If missings make the problem indeterminate then
tack on 

... if !missing(a, b, c, d, e) 
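
A short sketch of why the qualifier matters, on made-up data. Stata treats a missing value as true in a logical expression and as larger than any number in a comparison, so without the qualifier an incomplete observation can come out as 1. The variable names naive and guarded are hypothetical:

```stata
* Illustrative data only: each row has at least one missing value
clear
input a b c d e
. 1 1 0 0
1 . 0 0 0
1 1 1 0 .
end

* Missing a counts as true; a missing sum satisfies >= 2
gen byte naive = a & (b + c + d + e >= 2)

* The qualifier leaves incomplete observations missing
gen byte guarded = a & (b + c + d + e >= 2) if !missing(a, b, c, d, e)

list
```

Here naive is 1 in every row, including the second, where only one of b, c, d, e is known to be true, while guarded is missing throughout.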

Nick 
n.j.cox@durham.ac.uk 

Thomas Speidel

Following up on my previous post:  
http://www.stata.com/statalist/archive/2010-06/msg00984.html
here is an example for something I am trying to achieve in a  
nice/efficient/elegant way.

I have a number of dummies: a, b, c, d, e (missing values do exist)
Disease=true if the following conditions are met:

1) a must be true AND
2) any two of b, c, d, e are true

As I said missing values are crucial, especially when evaluating the  
second condition.

My current program works, but I don't think it is efficient and it  
probably does things that are unnecessary:

*******************************************
egen anytwo = rowtotal(a b c d e), missing
egen missing = rowmiss(a b c d e)
replace anytwo = . if (anytwo==0 & missing>=2 & missing<.)
replace anytwo = . if (anytwo==1 & missing==1)
replace anytwo = . if (anytwo==1 & missing==3)
replace anytwo = . if (missing>=4)

gen disease = 1 if (a==1 & anytwo>=2 & anytwo<.)
replace disease = 0 if (a==1 & anytwo<2)
replace disease = 0 if a==0
replace disease =. if a==.
*******************************************

I tried to play around with cond, but I found it was making this much  
more complicated than it is.  I know I am complicating my life more  
than I need to which is why I am looking for alternative solutions.
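
A possible sketch of the rule stated above, under one assumed reading of missing: a missing value is indeterminate, and the result is set only when the known values already settle it. This is not the program posted above; the helper variables ntrue and nmiss and the test data are made up for illustration.

```stata
* Illustrative data only
clear
input a b c d e
1 1 1 . .
1 1 . 0 0
1 1 0 0 0
0 . . . .
. 1 1 1 1
. 0 0 0 1
end

* Number of b, c, d, e known to be true, and number that are missing
gen byte ntrue = (b == 1) + (c == 1) + (d == 1) + (e == 1)
egen byte nmiss = rowmiss(b c d e)

gen byte disease = .
* True only if a is known true and two of b-e are known true
replace disease = 1 if (a == 1) & (ntrue >= 2)
* False if a is known false, or if two trues are impossible even
* when every missing value among b-e is counted as true
replace disease = 0 if (a == 0) | (ntrue + nmiss < 2)

* The same rule written with nested cond()
gen byte disease2 = cond((a == 0) | (ntrue + nmiss < 2), 0, ///
    cond((a == 1) & (ntrue >= 2), 1, .))

list
```

Observations where neither condition can be settled, such as the second row (one known true, one missing) and the fifth (a missing, others true), are left missing.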
