
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: calculating cumulative exposure
Date: Fri, 3 Aug 2007 18:44:54 +0100

Others have addressed the question of efficiency, advising the use of -by:-. Expanding one comment made elsewhere and making another:

1. -sort id- could be a little dangerous if you have no time variable. You should check out the -stable- option.

2. Just in case it is not clear, your code below is incorrect as well as inefficient. It will only give correct answers for the first three observations. Perhaps you meant it as a sketch, but programmers tend to take code literally....

Nick
n.j.cox@durham.ac.uk

Raoul C Reulen

> I've got multiple records per person and want to calculate
> some kind of cumulative exposure index per person. I've got a
> variable called "exposure" and want to create a new variable
> called "cum_exposure". Each observation in the cum_exposure
> variable should give the sum of all previous cells in the
> exposure column (but per person). So something like this:
>
> Id   Exposure   list   Cum.Exposure
> 1    10         1      10
> 1    14         2      24
> 1    15         3      39
> 2    8          1      8
> 2    10         2      18
> 2    15         3      32
>
> I've tried this:
>
> .by id: gen list=_n
> .gen cum_exposure=.
> .replace cum_exposure= exposure[1] if list==1
> .replace cum_exposure= exposure[1] + exposure [2] if list==2
> .replace cum_exposure= exposure[1] + exposure [2] +
> exposure[3] if list==3
>
> But how can I do this more efficiently? In a loop for example?
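
For concreteness, here is a minimal sketch of the -by:- approach referred to above, applied to the example data from the quoted question. It is one possible way of doing this, not necessarily the exact code suggested elsewhere in the thread. The variable obsno is a hypothetical helper, not in the original data: it records the current order of observations and stands in for the missing time variable, so that the order within each id is reproducible after sorting. The function sum() in -generate- returns a running sum, and under -by:- it restarts for each id.

```stata
* Example data as given in the question
clear
input id exposure
1 10
1 14
1 15
2 8
2 10
2 15
end

* obsno (hypothetical helper): preserve the original order within id,
* assuming that the records are already in the intended time order
gen long obsno = _n

* Running sum of exposure, restarted for each id
bysort id (obsno): gen cum_exposure = sum(exposure)

list id exposure cum_exposure, sepby(id)
```

If no helper variable is wanted, -sort id, stable- followed by -by id: gen cum_exposure = sum(exposure)- should give the same result, as the -stable- option keeps observations with equal id in their existing order. Note that the last running sum for id 2 is 8 + 10 + 15 = 33, so the 32 in the question's table appears to be a slip.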

