Thanks for the replies thus far.

I'd prefer a scalar so I don't have to generate the extra variables with a value attached to each subject.

Based on all the comments you have generously offered, I'm closer to a solution:

forvalues i = 1995/1998 {
    count if dxflag==1 & year==`i'
    scalar cases`i' = r(N)
    count if year==`i'
    scalar total`i' = r(N)
    scalar prev`i' = cases`i'/total`i'
}
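One way to check whether the denominators really differ is to print the scalars once the loop has run. This is only a minimal sketch, assuming the loop above has already created cases`i', total`i' and prev`i' for 1995 through 1998:

```stata
* Print each year's scalars; assumes the forvalues loop above has been run
forvalues i = 1995/1998 {
    display "`i': cases = " scalar(cases`i') ", total = " scalar(total`i') ", prevalence = " %6.4f scalar(prev`i')
}
```

Typing -scalar list- should also show every scalar currently defined.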

The problem is that this doesn't seem to give me different denominators (i.e., a total for each year). Any thoughts?

Heather

I know this must be simple, but something eludes me.

I have a data set of subjects (cases and non-cases) by year and am trying to calculate a prevalence per year, i.e., cases/(cases + non-cases).

The data look like this:

id   year   dxflag
1    1995   1
1    1996   0
1    1997   0
2    1995   0
2    1996   1
...

I found that I can write

by year: count if dxflag==1

or

by year: count

and these return the number of cases per year (the first command) and the total population per year (the second command), just as I want.

But what I need are these values as scalars, so I can calculate a prevalence per year. The Stata 8 help/manual says that -count- saves the number of observations in the scalar r(N), but I can't seem to get this number to appear. Do I actually have to write a program to calculate this? And how does the "by year" part fit in?
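For reference, a minimal sketch of how r(N) is usually read: it is used straight after -count-, before another r-class command replaces the saved results. The year 1995 and the scalar name cases1995 are only illustrative choices.

```stata
* r(N) is set by -count-; read it straight away
count if dxflag == 1 & year == 1995
display r(N)

* Copy it into a named scalar so a later r-class command cannot overwrite it
scalar cases1995 = r(N)

* -return list- shows everything the last r-class command saved
return list
```

With the by prefix, as far as I understand it, only the last group's results are left in r(), which would explain why a single r(N) cannot hold one count per year.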

Thanks for your help,

Heather

