Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Thomas Speidel <[email protected]>
To: [email protected]
Subject: RE: st: RE: AW: RE: Evaluating a set of conditions
Date: Wed, 23 Jun 2010 16:08:40 -0600
Forcing myself to use -cond()-, I get one step closer, but I am not quite there yet
(in the code below I added one more observation, id==19):
======================
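* disease for ids 1-18 is the hand-coded column from the table quoted below;
* id 19 has a==0, so disease is 0
input byte(id a b c d e disease)
 1   0   0   1   0   0   0
 2   1   0   1   1   0   1
 3   .   1   1   1   1   .
 4   0   1   1   0   1   0
 5   1   0   0   0   0   0
 6   1   .   1   1   0   1
 7   0   0   0   0   0   0
 8   1   .   .   .   1   .
 9   1   0   0   0   0   0
10   1   .   .   1   1   1
11   1   .   1   0   0   .
12   1   0   1   0   0   0
13   1   0   1   0   0   0
14   .   0   1   0   0   .
15   1   0   1   0   0   0
16   1   0   1   1   0   1
17   0   0   1   0   0   0
18   1   .   .   0   .   .
19   0   .   .   .   1   0
end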
egen missing = rowmiss(b c d e)
gen byte test = cond(missing(a), ., cond(a==1, cond(missing<=1, (b + c + d + e)>=2, ///
    cond((b + c + d + e)>=2, 1, .)), 0))
. assert disease==test
3 contradictions in 19 observations
assertion is false
=======================
this seems to fail whenever a==1 & missing==3
Thomas
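A note on the attempt above: in Stata a missing value sorts above every number, so (b + c + d + e)>=2 evaluates to 1 as soon as any of b, c, d, e is missing. That would explain contradictions not only when missing==3 but also for a row like id==11, where only one value is missing. One possible way around it, sketched here under the assumption that the rule is "1 if at least two are known to be true, 0 if two can no longer be reached, missing otherwise", is to count the ones and the missings separately. The names ones, nmiss and test2 are made up for this sketch; the rows are a subset of the hand-coded table posted earlier in the thread. Run it from a do-file, since it uses /// line continuation.

```stata
clear
input byte(id a b c d e disease)
 3  .  1  1  1  1  .
 4  0  1  1  0  1  0
 5  1  0  0  0  0  0
 6  1  .  1  1  0  1
 8  1  .  .  .  1  .
10  1  .  .  1  1  1
11  1  .  1  0  0  .
18  1  .  .  0  .  .
end

* ones  = how many of b c d e are known to be 1
* nmiss = how many of b c d e are missing
egen byte ones  = anycount(b c d e), values(1)
egen byte nmiss = rowmiss(b c d e)

* a missing          -> missing
* a == 0             -> 0
* two known ones     -> 1
* two unreachable    -> 0
* anything else      -> missing (not enough information)
gen byte test2 = cond(missing(a), ., ///
    cond(a == 0, 0, ///
    cond(ones >= 2, 1, ///
    cond(ones + nmiss < 2, 0, .))))

assert disease == test2
```

The point of the design is that no comparison is ever made against a sum that could itself be missing; both counts are always nonmissing integers.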
Quoting Thomas Speidel <[email protected]> Wed 23 Jun 13:13:54 2010:
While trying to simplify the problem for the list (my variables are
not actually called a, b, c, etc.) I must have inadvertently
introduced some problems.  Sorry for the confusion.
Nonetheless, the variable called "disease" in the n=18 dataset is  
indeed what I am trying to achieve.
Thomas
Quoting Martin Weiss <[email protected]> Wed 23 Jun 13:02:40 2010:
<>
Your own code returns "1" for id==11. Have you changed your mind?
***********
clear*
inp byte(id  a b c d e)
1   0   0   1   0   0
2   1   0   1   1   0
3   .   1   1   1   1
4   0   1   1   0   1
5   1   0   0   0   0
6   1   .   1   1   0
7   0   0   0   0   0
8   1   .   .   .   1
9   1   0   0   0   0
10   1   .   .   1   1
11   1   .   1   0   0
12   1   0   1   0   0
13   1   0   1   0   0
14   .   0   1   0   0
15   1   0   1   0   0
16   1   0   1   1   0
17   0   0   1   0   0
18   1   .   .   0   .
end
egen anytwo = rowtotal(a b c d e), missing
egen missing = rowmiss(a b c d e)
replace anytwo = . if (anytwo==0 & missing>=2 & missing<.)
replace anytwo = . if (anytwo==1 & missing==1)
replace anytwo = . if (anytwo==1 & missing==3)
replace anytwo = . if (missing>=4)
gen disease = 1 if (a==1 & anytwo>=2 & anytwo<.)
replace disease = 0 if (a==1 & anytwo<2)
replace disease = 0 if a==0
replace disease =. if a==.
list in 11, noo
***********
HTH
Martin
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Thomas Speidel
Sent: Wednesday, 23 June 2010 17:10
To: [email protected]
Subject: Re: st: RE: AW: RE: Evaluating a set of conditions
Thanks Martin and Nick.  Here is an example where I have added more
missing and manually created "disease" to clarify how the missing
would impact the results:
    id   a   b   c   d   e   disease
     1   0   0   1   0   0         0
     2   1   0   1   1   0         1
     3   .   1   1   1   1         .
     4   0   1   1   0   1         0
     5   1   0   0   0   0         0
     6   1   .   1   1   0         1
     7   0   0   0   0   0         0
     8   1   .   .   .   1         .
     9   1   0   0   0   0         0
    10   1   .   .   1   1         1
    11   1   .   1   0   0         .
    12   1   0   1   0   0         0
    13   1   0   1   0   0         0
    14   .   0   1   0   0         .
    15   1   0   1   0   0         0
    16   1   0   1   1   0         1
    17   0   0   1   0   0         0
    18   1   .   .   0   .         .
Take a look at id==11 for example, where I don't have enough
information to determine disease presence.
Thomas Speidel
Quoting Nick Cox <[email protected]> Wed 23 Jun 06:59:44 2010:
Yes, if there are missings it's more complicated than my initial answer
could suggest.
(a == 1) & (((b == 1) + (c ==1) + (d == 1) + (e == 1)) >= 2)
would seem to match the possibilities better.
Nick
[email protected]
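As a small illustration of what the expression above does with missings (a sketch only; the variable name flag and the three rows are chosen for the example): each comparison such as (b == 1) is 0 when b is missing, so the result is always 0 or 1 and never missing.

```stata
clear
input byte(a b c d e)
1 . 1 0 0
1 . . 1 1
. 1 1 1 1
end

* every (x == 1) is 0 or 1, also when x is missing
gen byte flag = (a == 1) & (((b == 1) + (c == 1) + (d == 1) + (e == 1)) >= 2)
list
```

Here the second row gets 1, while the first and third rows get 0. Whether those two rows should instead be marked as indeterminate depends on how missings are meant to be read.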
Martin Weiss
The result does seem to differ much, though, from the one Thomas
evidently wants - as expressed by his example:
*************
clear*
set obs 10000
set seed 12345
foreach var of newlist a b c d e{
	gen byte `var'=runiform()<.5
	replace `var'=. if runiform()<.15
}
//NJC
gen disease_true = a & (b + c + d + e >= 2) /*
*/  if !missing(a, b, c, d, e)
//Thomas
egen anytwo = rowtotal(a b c d e), missing
egen missing = rowmiss(a b c d e)
replace anytwo = . if (anytwo==0 & missing>=2 & missing<.)
replace anytwo = . if (anytwo==1 & missing==1)
replace anytwo = . if (anytwo==1 & missing==3)
replace anytwo = . if (missing>=4)
gen disease = 1 if (a==1 & anytwo>=2 & anytwo<.)
replace disease = 0 if (a==1 & anytwo<2)
replace disease = 0 if a==0
replace disease =. if a==.
//Comparison
compare disease_true disease
assert disease_true == disease
*************
Nick Cox
I think you need to be clear whether missing means true, false or
indeterminate as far as this is concerned.
Setting aside missings, as a, b, c, d, e are Booleans (1 = true, 0 =
false) then
gen disease_true = a & (b + c + d + e >= 2)
is one way to do it. If missings make the problem indeterminate then
tack on
... if !missing(a, b, c, d, e)
Nick
[email protected]
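To make the role of the -if- qualifier concrete, here is a minimal sketch (the names naive and guarded and the three rows are assumptions for the example). Stata treats a missing value as true in a logical expression and as larger than any number in a comparison, so without the qualifier a row with missings can come out as 1.

```stata
clear
input byte(a b c d e)
1 . 0 0 0
. 1 1 0 0
1 1 1 0 0
end

* without the qualifier: missings count as true / as very large
gen byte naive   = a & (b + c + d + e >= 2)

* with the qualifier: rows with any missing are left missing
gen byte guarded = a & (b + c + d + e >= 2) if !missing(a, b, c, d, e)
list
```

In this sketch naive is 1 in all three rows, whereas guarded is missing in the first two rows and 1 in the third.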
Thomas Speidel
Following up on my previous post:
http://www.stata.com/statalist/archive/2010-06/msg00984.html
here is an example of something I am trying to achieve in a
nice/efficient/elegant way.
I have a number of dummies: a, b, c, d, e (missing values do exist)
Disease=true if the following conditions are met:
1) a must be true AND
2) any two of b, c, d, e are true
As I said missing values are crucial, especially when evaluating the
second condition.
My current program works, but I don't think it is efficient and it
probably does things that are unnecessary:
*******************************************
egen anytwo = rowtotal(a b c d e), missing
egen missing = rowmiss(a b c d e)
replace anytwo = . if (anytwo==0 & missing>=2 & missing<.)
replace anytwo = . if (anytwo==1 & missing==1)
replace anytwo = . if (anytwo==1 & missing==3)
replace anytwo = . if (missing>=4)
gen disease = 1 if (a==1 & anytwo>=2 & anytwo<.)
replace disease = 0 if (a==1 & anytwo<2)
replace disease = 0 if a==0
replace disease =. if a==.
*******************************************
I tried to play around with cond(), but I found it was making this much
more complicated than it is.  I know I am complicating my life more
than I need to, which is why I am looking for alternative solutions.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
--
Thomas Speidel