Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


| From | Patricia Biedermann <pati.stat@gmail.com> |
| To | statalist@hsphsun2.harvard.edu |
| Subject | Re: st: RE: summarize by different levels/groups with -egen- ? |
| Date | Fri, 11 Jan 2013 17:39:59 +0100 |

Hello,

Thank you, Lovisa & Nick.

I've tried your commands, but they don't seem to give me what I want (pathogen is a string variable).

The issue is that when I create the dummy variable at the end (as described by Lovisa), I get a "1" for each H in a class. When I then summarize it, I get the total number of H observations. But I want the total number of classes that are affected by H (regardless of how many children in the class were affected by the pathogen).

e.g.

    Class  Pathogen
    A1     H
    A1     S
    A1     T
    A2     S
    A2     K
    A3     H
    A3     D
    B1     H
    B1     S  0

Finally --> 3 (out of 4) classes are affected by "H". (I don't care how many individuals in a class are affected!)

Maybe I have to think about it and approach it differently.

Cheers.

On Fri, Jan 11, 2013 at 1:46 PM, Nick Cox <njcoxstata@gmail.com> wrote:
> You don't need a dummy or indicator variable. Assuming that -pathogen-
> is a string variable,
>
> ... mean(pathogen == "H")
>
> will work fine as the -mean()- function of -egen- takes expressions.
> If it's a numeric variable, the same principle applies, but you need a
> different expression.
>
> Nick
>
> On Fri, Jan 11, 2013 at 12:01 PM, Lovisa Persson
> <lovisa.persson@nek.uu.se> wrote:
>
>> First create a dummy variable for each pathogen, pathogeni.
>> Then generate the mean for each class and each pathogen(i) by writing:
>>
>> egen meanpathogeni=mean(pathogeni), by(class)
>>
>> every class that now has a certain pathogen in it will have a value of
>> meanpathogeni higher than zero, and every class that do not have a certain
>> pathogen in it will have a value of zero.
>> The observation value will be the same within classes, which is the mean
>> number of the pathogen in this class.
>>
>> So now you generate a new dummy variable that equals 1 if the value of
>> meanpathogeni is higher than one.
>> Now each class will have the same observation value which will be 1 or 0
>> depending on whether this class had at least one observation of this
>> particular pathogen in it.
>
> Patricia Biedermann
>
>> I want to summarize following:
>>
>>     School  Class  Pathogen
>>     A       A1     H
>>     A       A1     T
>>     A       A1     H
>>     A       A2     S
>>     A       A2     H
>>     A       A3     K
>>     A       A3     I
>>     B       B1     S
>>     B       B1     T
>>     B       B2     H
>>
>> I've visited different classes in different schools. In each class I checked
>> if the children were infected with some kind of pathogen.
>> - I found e.g that in class A1 two children were infected with
>> pathogen H.
>> - Now, I want to summarize that I just found pathogen H in class A1
>> WITHOUT the actual amount of pathogen itself (2 times in this case);
>> Basically "Was pathogen H found in class A1" = yes or no; Finally, the
>> information should be presented at school level. ("How many classes in
>> school A pathogen H was found?)
>>
>> So far I tried egen, bysort / =_n==N and commands. I also created dummy
>> variables for each pathogen. It never worked out the right way.

    *
    *   For searches and help try:
    *   http://www.stata.com/help.cgi?search
    *   http://www.stata.com/support/faqs/resources/statalist-faq/
    *   http://www.ats.ucla.edu/stat/stata/
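One possible way to get the count of affected classes per school, rather than the count of H observations, is sketched below. It is not taken from any reply in the thread. It builds on Nick's point that -egen- functions accept expressions, but uses -max()- with -tag()- and -total()- instead of -mean()-. The data are the ten rows from the original question; the variable names `hasH`, `oneperclass`, `oneperschool` and `nclassH` are made up for illustration.

```stata
clear
input str1 school str2 class str1 pathogen
A A1 H
A A1 T
A A1 H
A A2 S
A A2 H
A A3 K
A A3 I
B B1 S
B B1 T
B B2 H
end

* 1 if pathogen H occurs at least once in the class, 0 otherwise
egen byte hasH = max(pathogen == "H"), by(school class)

* flag exactly one observation per class, so each class is counted once
egen byte oneperclass = tag(school class)

* number of classes per school in which H was found
egen nclassH = total(oneperclass * hasH), by(school)

* show one line per school
egen byte oneperschool = tag(school)
list school nclassH if oneperschool, noobs
```

With these data the listing should show 2 for school A (classes A1 and A2) and 1 for school B (class B2). Multiplying by the class tag is what stops a class with several H cases from being counted more than once.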

**Follow-Ups**:
- **Re: st: RE: summarize by different levels/groups with -egen- ?** *From:* Joerg Luedicke <joerg.luedicke@gmail.com>

**References**:
- **st: summarize by different levels/groups with -egen- ?** *From:* Patricia Biedermann <pati.stat@gmail.com>
- **st: RE: summarize by different levels/groups with -egen- ?** *From:* "Lovisa Persson" <lovisa.persson@nek.uu.se>
- **Re: st: RE: summarize by different levels/groups with -egen- ?** *From:* Nick Cox <njcoxstata@gmail.com>
