


st: RE: summarize by different levels/groups with -egen- ?

From   "Lovisa Persson" <>
To   <>
Subject   st: RE: summarize by different levels/groups with -egen- ?
Date   Fri, 11 Jan 2013 13:01:01 +0100


First create a dummy variable for each pathogen, pathogeni, equal to 1 if the
observation has pathogen i and 0 otherwise.
Then generate the mean for each class and each pathogen (i) by writing:

egen meanpathogeni=mean(pathogeni), by(class)

Every class that has a certain pathogen in it will now have a value of
meanpathogeni higher than zero, and every class that does not have that
pathogen in it will have a value of zero.
The value is the same for all observations within a class: it is the share
of observations in that class that have the pathogen.
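As a minimal sketch of this step, the example data from the original question can be entered by hand and the class-level mean computed for a single pathogen. The variable names pathogenH and meanpathogenH are illustrative assumptions, standing in for pathogeni and meanpathogeni with i = H.

```stata
clear
input str1 school str2 class str1 pathogen
A A1 H
A A1 T
A A1 H
A A2 S
A A2 H
A A3 K
A A3 I
B B1 S
B B1 T
B B2 H
end

* Dummy for one pathogen (H); the name pathogenH is illustrative
generate byte pathogenH = (pathogen == "H")

* Class-level mean of the dummy, repeated on every observation of the class
egen meanpathogenH = mean(pathogenH), by(class)

list school class pathogen pathogenH meanpathogenH, sepby(class)
```

With these data, class A1 gets 2/3 on all three of its observations (two of three children had H), while classes A3 and B1 get zero.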

So now you generate a new dummy variable that equals 1 if the value of
meanpathogeni is higher than zero.
Each class will then have the same value on all of its observations, 1 or 0,
depending on whether the class had at least one observation of this
particular pathogen in it.
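The original question also asks for a count of classes per school. The sketch below shows one possible way to continue from the class-level dummy, using the example data from the question. The school-level step is an assumption about what is wanted, not part of the reply above, and the names foundH, oneperclass, nclassesH and oneperschool are illustrative.

```stata
clear
input str1 school str2 class str1 pathogen
A A1 H
A A1 T
A A1 H
A A2 S
A A2 H
A A3 K
A A3 I
B B1 S
B B1 T
B B2 H
end

* Dummy and class-level mean for pathogen H, as described above
generate byte pathogenH = (pathogen == "H")
egen meanpathogenH = mean(pathogenH), by(class)

* 1 if H was found at least once in the class, 0 otherwise
generate byte foundH = (meanpathogenH > 0)

* Flag exactly one observation per school-class combination
egen byte oneperclass = tag(school class)

* Count each class once: number of classes per school where H was found
egen nclassesH = total(oneperclass * foundH), by(school)

* Show one row per school
egen byte oneperschool = tag(school)
list school nclassesH if oneperschool
```

With these data, school A should show 2 (classes A1 and A2) and school B should show 1 (class B2). A shorter route to the class-level dummy would be egen foundH = max(pathogen == "H"), by(class), which skips the mean altogether.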

Was this what you wanted?
Good luck!


-----Original Message-----
[] On Behalf Of Patricia
Sent: den 11 januari 2013 12:11
Subject: st: summarize by different levels/groups with -egen- ?

Dear STATA users,

I want to summarize the following:

School    Class    Pathogen
A         A1       H
A         A1       T
A         A1       H
A         A2       S
A         A2       H
A         A3       K
A         A3       I
B         B1       S
B         B1       T
B         B2       H

I've visited different classes in different schools. In each class I checked
whether the children were infected with some kind of pathogen.
-	I found, e.g., that in class A1 two children were infected with
pathogen H.
-	Now I want to record only that pathogen H was found in class A1,
WITHOUT the number of times it was found (2 in this case).
Basically: "Was pathogen H found in class A1?" = yes or no. Finally, the
information should be presented at school level ("In how many classes in
school A was pathogen H found?").

So far I have tried egen and bysort with _n==_N, among other commands. I also
created dummy variables for each pathogen. It never worked out the right way.
Maybe it's just an error in reasoning.

THANKS A LOT in advance.

*   For searches and help try:

