Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# st: RE: summarize by different levels/groups with -egen- ?

 From "Lovisa Persson" To Subject st: RE: summarize by different levels/groups with -egen- ? Date Fri, 11 Jan 2013 13:01:01 +0100

```
Hello,

First create a dummy variable for each pathogen, pathogeni.
Then generate the mean for each class and each pathogen(i) by writing:

egen meanpathogeni=mean(pathogeni), by(class)

Every class that has a certain pathogen in it will now have a value of
meanpathogeni greater than zero, and every class that does not have that
pathogen in it will have a value of zero.
The value is the same for all observations within a class: it is the share
of that class's observations that are this pathogen.

So now you generate a new dummy variable that equals 1 if the value of
meanpathogeni is greater than zero (the mean of a 0/1 dummy can never
exceed one).
Each class will then have the same value on every observation, 1 or 0,
depending on whether this class had at least one observation of this
particular pathogen in it.

Good luck!

Lovisa

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Patricia
Biedermann
Sent: den 11 januari 2013 12:11
To: statalist@hsphsun2.harvard.edu
Subject: st: summarize by different levels/groups with -egen- ?

Dear STATA users,

I want to summarize the following:

School		Class		Pathogen
A			A1			H
A			A1			T
A			A1			H
A			A2			S
A			A2			H
A			A3			K
A			A3			I
B			B1			S
B			B1			T
B			B2			H

I've visited different classes in different schools. In each class I checked
if the children were infected with some kind of pathogen.
-	I found, e.g., that in class A1 two children were infected with
pathogen H.
-	Now, I want to record only that pathogen H was found in class A1,
WITHOUT the actual number of cases (2 in this case);
basically "Was pathogen H found in class A1?" = yes or no. Finally, the
information should be presented at school level ("In how many classes in
school A was pathogen H found?").

So far I tried egen, bysort / =_n==N and commands. I also created dummy
variables for each pathogen.  It never worked out the right way.
Maybe it's just an error in reasoning.

THANKS A LOT in advance.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

```
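The steps described in the reply, plus the school-level count asked for in the original question, might be sketched as follows for pathogen H. This is only an illustration under stated assumptions: the data are typed in from the example table, the variable names hasH, meanH, classH, firstclass, nclassH and firstschool are made up for this sketch, and by(school class) is used instead of by(class) as a guard in case class labels repeat across schools.

```stata
* Sketch only: example data copied from the question; all new variable
* names below are hypothetical.
clear
input str1 school str2 class str1 pathogen
A A1 H
A A1 T
A A1 H
A A2 S
A A2 H
A A3 K
A A3 I
B B1 S
B B1 T
B B2 H
end

* Dummy: 1 if this observation is pathogen H, 0 otherwise
generate byte hasH = (pathogen == "H")

* Share of H observations within each class (constant within class)
egen meanH = mean(hasH), by(school class)

* Class-level indicator: 1 if H was found at least once in the class
generate byte classH = (meanH > 0)

* Tag one observation per class so that each class is counted once
egen byte firstclass = tag(school class)

* Number of classes per school in which H was found
egen nclassH = total(classH * firstclass), by(school)

* Show one line per school
egen byte firstschool = tag(school)
list school nclassH if firstschool, noobs
```

With the example data this should report 2 classes for school A (A1 and A2) and 1 class for school B (B2). Using egen's max() function on the dummy instead of mean() would give the 0/1 class indicator in one step; the mean is kept here only to follow the reply.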