The Stata listserver

st: Survival time to prevalence data - efficient code?


From   "Arne Kolstad" <[email protected]>
To   <[email protected]>
Subject   st: Survival time to prevalence data - efficient code?
Date   Tue, 9 Sep 2003 00:05:30 +0200

I have survival time data about sickness spells, in the following form:

personid     startdate     stopdate
1            01may1997    07dec1997
1            28jan2002    09feb2002
2            31jul1994    06mar1998
.
.
N            31dec2002    (censored)

---


What I need is a table a) with the prevalence for each day:

day              spersons
01jan1994            897
02jan1994            789
.
.
31dec2002            987

---

and a table b) of person-days of sickness for each month through the period
of interest:


month              pdays
jan1994            22345
feb1994            24567
.
.
dec2002            26789

---
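
A minimal sketch of how table b) might be derived once table a) exists, assuming table a) is in memory as one observation per day with a numeric Stata daily date (called day here, a name chosen only for illustration) and the count spersons. Summing the daily prevalences within each calendar month should give the person-days for that month:

```stata
* Assumes one observation per day: day (numeric daily date, hypothetical name)
* and spersons (number of persons sick on that day).
generate month = mofd(day)      // daily date -> monthly date
format month %tm

* Each sick person contributes one person-day per day of sickness,
* so the monthly total of spersons is the person-days for that month.
collapse (sum) pdays = spersons, by(month)
```

This runs over a few thousand daily rows rather than the 5 million spell records, so it should be cheap compared with producing table a) itself.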


I believe I can produce data set a) as follows:

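* 12419 = 01jan1994 and 15705 = 31dec2002 as Stata daily dates
forvalues x = 12419/15705 {
    * subjects whose spell started on or before day x and has not yet stopped
    quietly stdes if startdate <= `x' & stopdate > `x'
    display r(N_sub)
}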

Now to the real problem: the data set has more than 5 million records.
Looping through thousands of days is slow, partly because stdes does a lot
of work on each pass, and I need to repeat the whole run many times as new
versions of the data are produced. Is there a more efficient method?
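
One possible direction, offered as an untested sketch rather than a definitive answer: instead of checking every day against every spell, count +1 on each startdate and -1 on each stopdate, then take a running sum over days. This assumes startdate and stopdate are numeric Stata daily dates, that censored spells have a missing stopdate, that stopdate is never before startdate, and that the installed Stata supports the generate() option of expand. The names isstop, day, change and the file name prevalence_daily are made up for illustration. Note that this counts spells, which matches the subject count from stdes only if no person has two overlapping spells.

```stata
* Sketch only: assumes numeric daily dates and missing stopdate if censored.
preserve
keep startdate stopdate
expand 2, generate(isstop)               // isstop = 1 marks the added copy
generate long day = cond(isstop, stopdate, startdate)
generate byte change = cond(isstop, -1, 1)
drop if missing(day)                     // censored spells never stop
collapse (sum) change, by(day)           // net change in sick spells per day
tsset day, daily
tsfill                                   // add days on which nothing changed
replace change = 0 if missing(change)
sort day
generate long spersons = sum(change)     // spells with start <= day < stop
keep if inrange(day, 12419, 15705)       // 01jan1994 to 31dec2002
save prevalence_daily, replace           // hypothetical file name
restore
```

The running sum on a given day equals the number of starts up to and including that day minus the number of stops up to and including that day, which is the same condition as startdate <= x & stopdate > x in the loop. The work is one expand, one collapse and one sort instead of several thousand passes over the data. tsfill only fills gaps between the earliest and latest event date, so the result covers the full window only if events occur on or before 01jan1994 and on or after 31dec2002.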



