I have survival time data about sickness spells, in the following form:

personid     startdate     stopdate
1            01may1997    07dec1997
1            28jan2002    09feb2002
2            31jul1994    06mar1998
N            31dec2002    (censored)


What I need is a table a) with the prevalence (number of persons sick) on each day:

day              spersons
01jan1994            897
02jan1994            789
31dec2002            987


and a table b) of person-days of sickness for each month through the period
of interest:

month              pdays
jan1994            22345
feb1994            24567
dec2002            26789


I believe I can produce data set a) like this:

forvalues x = 12419/15705 {
    * persons with a spell open on day x (01jan1994 to 31dec2002)
    quietly stdes if startdate <= `x' & stopdate > `x'
    di r(N_sub)
}

So to the real problem: the data set has more than 5 million records.
Looping through thousands of days is slow, partly because stdes does a lot
of work, and I need to repeat it many times as different versions of the
data are produced. Is there a more efficient method?
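
To make the question concrete, here is a rough, untested sketch of the kind of
loop-free approach I have in mind: turn each spell into a +1 event on its start
day and a -1 event on its stop day, then take a running sum over days. It
assumes that startdate and stopdate are numeric Stata dates, that censored
spells have a missing stopdate, that spells do not overlap within a person (so
open spells equal sick persons), and that a spell counts on days
startdate <= day < stopdate, as in the loop above. The variable names spell,
change and day are made up for illustration.

```stata
* Sketch only: run on a copy of the data, since keep and collapse are destructive.
keep personid startdate stopdate
gen long spell = _n
expand 2                                   // one row for the start, one for the stop
bysort spell: gen byte change = cond(_n == 1, 1, -1)
gen long day = cond(change == 1, startdate, stopdate)
drop if missing(day)                       // censored spells never close
collapse (sum) change, by(day)             // net change in open spells per day
tsset day
tsfill                                     // add days on which nothing changed
gen long spersons = sum(change)            // running total; sum() treats missing as 0
keep if inrange(day, td(01jan1994), td(31dec2002))
format day %td
* table a) is now in memory: day, spersons

* table b): person-days per month as monthly sums of the daily counts
preserve
gen month = mofd(day)
format month %tm
collapse (sum) pdays = spersons, by(month)
list
restore
```

Note that tsfill only fills gaps between the first and last event day in the
data, so any days of the period before the first or after the last event would
be absent and would need to be added separately.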

