Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Joerg Luedicke <joerg.luedicke@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: summarize by different levels/groups with -egen- ?
Date: Fri, 11 Jan 2013 12:25:00 -0500

Consider the following:

// Data
clear
input str2 Class str1 Pathogen
A1 H
A1 S
A1 T
A2 S
A2 K
A3 H
A3 D
B1 H
B1 S
end

// Flagging classes with at least one H
bys Class: egen pat2=max(Pathogen=="H")

// To analyze that at class level
bys Class: gen tag=_n==1
keep if tag

Joerg

On Fri, Jan 11, 2013 at 11:39 AM, Patricia Biedermann <pati.stat@gmail.com> wrote:
> Hello,
> Thank you Lovisa & Nick.
> I've tried your commands, but they don't seem to work the way I want
> them to. (pathogen is a string variable.)
>
> The issue is that when I create the dummy variable at the end (as
> described by Lovisa), I get a "1" for each H in a class. When I
> then summarize it, I get the total number of H observations. But I want
> the total number of classes that are affected by H (regardless of how
> many children in each class were affected by the pathogen).
>
> e.g.
> Class Pathogen
> A1 H
> A1 S
> A1 T
> A2 S
> A2 K
> A3 H
> A3 D
> B1 H
> B1 S 0
>
> Finally --> 3 (out of 4) classes are affected by "H". (I don't care
> about how many individuals in one class!)
>
> Maybe I have to think about it and approach it differently.
> Cheers.
>
> On Fri, Jan 11, 2013 at 1:46 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>> You don't need a dummy or indicator variable. Assuming that -pathogen-
>> is a string variable,
>>
>> ... mean(pathogen == "H")
>>
>> will work fine as the -mean()- function of -egen- takes expressions.
>> If it's a numeric variable, the same principle applies, but you need a
>> different expression.
>>
>> Nick
>>
>> On Fri, Jan 11, 2013 at 12:01 PM, Lovisa Persson
>> <lovisa.persson@nek.uu.se> wrote:
>>
>>> First create a dummy variable for each pathogen, pathogeni.
>>> Then generate the mean for each class and each pathogen(i) by writing:
>>>
>>> egen meanpathogeni=mean(pathogeni), by(class)
>>>
>>> Every class that has a certain pathogen in it will now have a value of
>>> meanpathogeni higher than zero, and every class that does not have that
>>> pathogen in it will have a value of zero.
>>> The value will be the same for all observations within a class: it is the
>>> proportion of observations in that class with the pathogen.
>>>
>>> So now you generate a new dummy variable that equals 1 if the value of
>>> meanpathogeni is higher than zero.
>>> Now each class will have the same observation value, which will be 1 or 0
>>> depending on whether this class had at least one observation of this
>>> particular pathogen in it.
>>
>> Patricia Biedermann
>>
>>> I want to summarize the following:
>>>
>>> School Class Pathogen
>>> A A1 H
>>> A A1 T
>>> A A1 H
>>> A A2 S
>>> A A2 H
>>> A A3 K
>>> A A3 I
>>> B B1 S
>>> B B1 T
>>> B B2 H
>>>
>>> I've visited different classes in different schools. In each class I checked
>>> whether the children were infected with some kind of pathogen.
>>> - I found, e.g., that in class A1 two children were infected with
>>> pathogen H.
>>> - Now, I want to record only that I found pathogen H in class A1,
>>> WITHOUT the actual count of the pathogen itself (2 in this case);
>>> basically "Was pathogen H found in class A1?" = yes or no. Finally, the
>>> information should be presented at school level ("In how many classes in
>>> school A was pathogen H found?").
>>>
>>> So far I have tried egen, bysort / =_n==N and commands. I also created dummy
>>> variables for each pathogen. It never worked out the right way.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/