Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: summarize by different levels/groups with -egen- ?
Date: Fri, 11 Jan 2013 12:46:17 +0000

You don't need a dummy or indicator variable. Assuming that -pathogen- is a string variable,

... mean(pathogen == "H")

will work fine, as the -mean()- function of -egen- takes expressions. If it's a numeric variable, the same principle applies, but you need a different expression.

Nick

On Fri, Jan 11, 2013 at 12:01 PM, Lovisa Persson <lovisa.persson@nek.uu.se> wrote:

> First create a dummy variable for each pathogen, pathogeni.
> Then generate the mean for each class and each pathogen(i) by writing:
>
> egen meanpathogeni=mean(pathogeni), by(class)
>
> Every class that has a certain pathogen in it will now have a value of
> meanpathogeni higher than zero, and every class that does not have that
> pathogen in it will have a value of zero.
> The value will be the same for all observations within a class; it is the
> mean of the dummy, that is, the share of observations in this class with
> the pathogen.
>
> So now you generate a new dummy variable that equals 1 if the value of
> meanpathogeni is higher than zero.
> Now each class will have the same value, which will be 1 or 0
> depending on whether this class had at least one observation of this
> particular pathogen in it.

Patricia Biedermann

> I want to summarize the following:
>
> School  Class  Pathogen
> A       A1     H
> A       A1     T
> A       A1     H
> A       A2     S
> A       A2     H
> A       A3     K
> A       A3     I
> B       B1     S
> B       B1     T
> B       B2     H
>
> I've visited different classes in different schools. In each class I checked
> if the children were infected with some kind of pathogen.
> - I found, e.g., that in class A1 two children were infected with
> pathogen H.
> - Now, I want to summarize that I just found pathogen H in class A1
> WITHOUT the actual amount of pathogen itself (2 times in this case);
> Basically "Was pathogen H found in class A1" = yes or no; Finally, the
> information should be presented at school level. ("In how many classes in
> school A was pathogen H found?")
>
> So far I tried egen, bysort / =_n==N and commands. I also created dummy
> variables for each pathogen.
> It never worked out the right way.
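The suggestion at the top of this message can be sketched on the example data as follows. This is only an illustration, not code from the thread: the variable names -school-, -class- and -pathogen- are taken from the posted table, while -propH-, -anyH-, -classtag- and -nclassH- are made-up names. The -max()- line is an alternative to -mean()- that gives the yes/no indicator directly, and the -tag()- step for counting each class once is an assumption about how the school-level count might be obtained.

```stata
* Sketch only: rebuilds the example data from the question.
* school, class, pathogen are assumed to be string variables.
clear
input str1 school str2 class str1 pathogen
A A1 H
A A1 T
A A1 H
A A2 S
A A2 H
A A3 K
A A3 I
B B1 S
B B1 T
B B2 H
end

* As suggested: -mean()- takes an expression, so no dummy is needed.
* propH is the share of observations in each class with pathogen H.
egen propH = mean(pathogen == "H"), by(school class)

* Alternative (not from the thread): -max()- of the same expression
* is 1 if H was found at least once in the class, 0 otherwise.
egen byte anyH = max(pathogen == "H"), by(school class)

* Tag exactly one observation per class so each class counts once.
egen byte classtag = tag(school class)

* Number of classes per school in which H was found.
egen nclassH = total(classtag * anyH), by(school)

tabdisp school, cell(nclassH)
```

On these ten observations, H occurs in classes A1, A2 and B2, so -nclassH- should be 2 for school A and 1 for school B. For a numeric -pathogen- the expression inside -mean()- or -max()- would compare against the numeric code instead of "H".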
