Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Thomas Speidel <thomas@tmbx.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | RE: st: RE: AW: RE: Evaluating a set of conditions |
Date | Wed, 23 Jun 2010 16:58:03 -0600 |
I have fixed the missing "disease" variable below. I am trying to use as few lines of code as possible and to avoid a series of -replace- statements. Replacements are easy to follow, but they create a hierarchy that is hard to keep track of and error-prone. Below I have purposely written the logic as a series of replacements. I am looking for the shortest possible way to achieve this, of the kind I often see in posts by you and Nick...
======================
input byte(id a b c d e disease)
1 0 0 1 0 0 0
2 1 0 1 1 0 1
3 . 1 1 1 1 .
4 0 1 1 0 1 0
5 1 0 0 0 0 0
6 1 . 1 1 0 1
7 0 0 0 0 0 0
8 1 . . . 1 .
9 1 0 0 0 0 0
10 1 . . 1 1 1
11 1 . 1 0 0 .
12 1 0 1 0 0 0
13 1 0 1 0 0 0
14 . 0 1 0 0 .
15 1 0 1 0 0 0
16 1 0 1 1 0 1
17 0 0 1 0 0 0
18 1 . . 0 . .
19 0 . . . 1 0
end
egen missing = rowmiss(b c d e)
egen sum = rowtotal(b c d e)
gen byte test = 1 if a==1 & missing<=1 & sum>=2
replace test = 0 if a==1 & missing<=1 & sum<2
replace test = . if a==1 & missing==1 & sum==1
replace test = 1 if a==1 & missing==2 & sum>=2
replace test = . if a==1 & missing==2 & sum<2
replace test = . if a==1 & missing>=3
replace test = 0 if a==0

. assert disease==test

.
======================

Quoting Martin Weiss <martin.weiss1@gmx.de> Wed 23 Jun 16:22:33 2010:
<>

In your -input- call, the "disease" variable seems to be missing. Anyway, if the only problem is with a certain combination, why not add

***********
replace test=. if a==1 & missing==3
***********

HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Thomas Speidel
Sent: Thursday, 24 June 2010 00:09
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: RE: AW: RE: Evaluating a set of conditions

Forcing myself to use -cond-, I get one step closer, but I am not quite there yet (in the code below I added one more obs at id==19):

======================
input byte(id a b c d e disease)
1 0 0 1 0 0
2 1 0 1 1 0
3 . 1 1 1 1
4 0 1 1 0 1
5 1 0 0 0 0
6 1 . 1 1 0
7 0 0 0 0 0
8 1 . . . 1
9 1 0 0 0 0
10 1 . . 1 1
11 1 . 1 0 0
12 1 0 1 0 0
13 1 0 1 0 0
14 . 0 1 0 0
15 1 0 1 0 0
16 1 0 1 1 0
17 0 0 1 0 0
18 1 . . 0 .
19 0 . . . 1
end
egen missing = rowmiss(b c d e)
gen byte test =cond(missing(a), ., cond(a==1, cond(missing<=1, (b + c + d + e)>=2, cond((b + c + d + e)>=2, 1, .)), 0))

. assert disease==test
3 contradictions in 19 observations
assertion is false
=======================

This seems to fail whenever a==1 & missing==3.

Thomas

Quoting Thomas Speidel <thomas@tmbx.com> Wed 23 Jun 13:13:54 2010:

While trying to simplify the problem for the list (my variables are not actually called a, b, c, etc.) I must have inadvertently introduced some problems. Sorry for the confusion. Nonetheless, the variable called "disease" in the n=18 dataset is indeed what I am trying to achieve.

Thomas

Quoting Martin Weiss <martin.weiss1@gmx.de> Wed 23 Jun 13:02:40 2010:

<>

Your own code returns "1" for id==11. Have you changed your mind?

***********
clear*
inp byte(id a b c d e)
1 0 0 1 0 0
2 1 0 1 1 0
3 . 1 1 1 1
4 0 1 1 0 1
5 1 0 0 0 0
6 1 . 1 1 0
7 0 0 0 0 0
8 1 . . . 1
9 1 0 0 0 0
10 1 . . 1 1
11 1 . 1 0 0
12 1 0 1 0 0
13 1 0 1 0 0
14 . 0 1 0 0
15 1 0 1 0 0
16 1 0 1 1 0
17 0 0 1 0 0
18 1 . . 0 .
end

egen anytwo = rowtotal(a b c d e), missing
egen missing = rowmiss(a b c d e)
replace anytwo = . if (anytwo==0 & missing>=2 & missing<.)
replace anytwo = . if (anytwo==1 & missing==1)
replace anytwo = . if (anytwo==1 & missing==3)
replace anytwo = . if (missing>=4)
gen disease = 1 if (a==1 & anytwo>=2 & anytwo<.)
replace disease = 0 if (a==1 & anytwo<2)
replace disease = 0 if a==0
replace disease =. if a==.

list in 11, noo
***********

HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Thomas Speidel
Sent: Wednesday, 23 June 2010 17:10
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: AW: RE: Evaluating a set of conditions

Thanks Martin and Nick. Here is an example where I have added more missing values and manually created "disease" to clarify how the missing values would affect the results:

id a b c d e disease
1 0 0 1 0 0 0
2 1 0 1 1 0 1
3 . 1 1 1 1 .
4 0 1 1 0 1 0
5 1 0 0 0 0 0
6 1 . 1 1 0 1
7 0 0 0 0 0 0
8 1 . . . 1 .
9 1 0 0 0 0 0
10 1 . . 1 1 1
11 1 . 1 0 0 .
12 1 0 1 0 0 0
13 1 0 1 0 0 0
14 . 0 1 0 0 .
15 1 0 1 0 0 0
16 1 0 1 1 0 1
17 0 0 1 0 0 0
18 1 . . 0 . .

Take a look at id==11, for example, where I don't have enough information to determine disease presence.

Thomas Speidel

Quoting Nick Cox <n.j.cox@durham.ac.uk> Wed 23 Jun 06:59:44 2010:

Yes, if there are missings it's more complicated than my initial answer could suggest.

(a == 1) & (((b == 1) + (c ==1) + (d == 1) + (e == 1)) >= 2)

would seem to match the possibilities better.

Nick
n.j.cox@durham.ac.uk

Martin Weiss

The result does seem to differ much, though, from the one Thomas evidently wants - as expressed by his example:

*************
clear*
set obs 10000
set seed 12345

foreach var of newlist a b c d e{
	gen byte `var'=runiform()<.5
	replace `var'=. if runiform()<.15
}

//NJC
gen disease_true = a & (b + c + d + e >= 2) /*
*/ if !missing(a, b, c, d, e)

//Thomas
egen anytwo = rowtotal(a b c d e), missing
egen missing = rowmiss(a b c d e)
replace anytwo = . if (anytwo==0 & missing>=2 & missing<.)
replace anytwo = . if (anytwo==1 & missing==1)
replace anytwo = . if (anytwo==1 & missing==3)
replace anytwo = . if (missing>=4)
gen disease = 1 if (a==1 & anytwo>=2 & anytwo<.)
replace disease = 0 if (a==1 & anytwo<2)
replace disease = 0 if a==0
replace disease =. if a==.

//Comparison
compare disease_true disease
as disease_true ==disease
*************

Nick Cox

I think you need to be clear whether missing means true, false or indeterminate as far as this is concerned.

Setting aside missings, as a, b, c, d, e are Booleans (1 = true, 0 = false) then

gen disease_true = a & (b + c + d + e >= 2)

is one way to do it.

If missings make the problem indeterminate then tack on

... if !missing(a, b, c, d, e)

Nick
n.j.cox@durham.ac.uk

Thomas Speidel

Following up on my previous post:
http://www.stata.com/statalist/archive/2010-06/msg00984.html
here is an example of something I am trying to achieve in a nice/efficient/elegant way.

I have a number of dummies: a, b, c, d, e (missing values do exist).
Disease=true if the following conditions are met:
1) a must be true
AND
2) any two of b, c, d, e are true

As I said, missing values are crucial, especially when evaluating the second condition. My current program works, but I don't think it is efficient and it probably does things that are unnecessary:

*******************************************
egen anytwo = rowtotal(a b c d e), missing
egen missing = rowmiss(a b c d e)
replace anytwo = . if (anytwo==0 & missing>=2 & missing<.)
replace anytwo = . if (anytwo==1 & missing==1)
replace anytwo = . if (anytwo==1 & missing==3)
replace anytwo = . if (missing>=4)
gen disease = 1 if (a==1 & anytwo>=2 & anytwo<.)
replace disease = 0 if (a==1 & anytwo<2)
replace disease = 0 if a==0
replace disease =. if a==.
*******************************************

I tried to play around with -cond-, but I found it was making this much more complicated than it is. I know I am complicating my life more than I need to, which is why I am looking for alternative solutions.

-- 
Thomas Speidel

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
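For reference, the rules encoded by the -replace- chain in the most recent message at the top of this thread appear to reduce to three outcomes once a is known to be 1: disease is present if at least two of b, c, d, e equal 1; absent if fewer than two could equal 1 even if every missing value were a 1; and indeterminate otherwise. What follows is a minimal sketch of that reading, not a solution posted in the thread. It assumes a through e are coded strictly 0, 1, or missing, and the variable names n_miss and n_ones are introduced here purely for illustration.

```stata
clear
input byte(id a b c d e disease)
1 0 0 1 0 0 0
2 1 0 1 1 0 1
3 . 1 1 1 1 .
4 0 1 1 0 1 0
5 1 0 0 0 0 0
6 1 . 1 1 0 1
7 0 0 0 0 0 0
8 1 . . . 1 .
9 1 0 0 0 0 0
10 1 . . 1 1 1
11 1 . 1 0 0 .
12 1 0 1 0 0 0
13 1 0 1 0 0 0
14 . 0 1 0 0 .
15 1 0 1 0 0 0
16 1 0 1 1 0 1
17 0 0 1 0 0 0
18 1 . . 0 . .
19 0 . . . 1 0
end

* Count missings and ones among b, c, d, e.
* rowtotal() treats missing as 0, so with 0/1 coding (an assumption)
* n_ones is the number of observed ones.
egen n_miss = rowmiss(b c d e)
egen n_ones = rowtotal(b c d e)

* a == 0                                   -> 0
* at least two observed ones               -> 1
* fewer than two even if all missings = 1  -> 0
* anything else                            -> indeterminate (missing)
* Observations with missing a are excluded by -if- and so stay missing.
gen byte test = cond(a == 0, 0,          ///
    cond(n_ones >= 2, 1,                 ///
    cond(n_ones + n_miss < 2, 0, .))) if !missing(a)

* Compare against the hand-coded target from the thread.
assert disease == test
```

The -///- line continuation is only understood in a do-file; typed interactively, the -generate- statement would need to go on a single line. Testing n_ones + n_miss against 2 is what separates "cannot reach two" from "might reach two", which is the distinction the longer -replace- chain draws case by case.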