Statalist



st: RE: calculating cumulative exposure


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: calculating cumulative exposure
Date   Fri, 3 Aug 2007 18:44:54 +0100

Others have addressed the question of efficiency, 
advising the use of -by:-. 

Expanding one comment made elsewhere and making
another:

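1. -sort id- could be a little dangerous if 
you have no time variable: the order of 
observations within each id is not guaranteed 
to be preserved, so a cumulative sum could 
differ from one sort to the next. You should 
check out the -stable- option. 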

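2. Just in case it is not clear, your code
below is incorrect as well as inefficient. 
It will only give correct answers for 
the first three observations, because 
exposure[1], exposure[2] and exposure[3] 
refer to the first three observations in 
the whole dataset, not within each id. 
Perhaps you meant it as a sketch, but 
programmers tend to take code literally... 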
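
A minimal sketch of both points, not necessarily 
what others suggested. It assumes the data are 
already in the intended order within each id; 
the variable name obsno is my own invention for 
illustration. Recording the current order first 
makes the sort reproducible, and sum() gives a 
running sum that -by:- restarts for each id. 

```stata
* Example data from the question (typed in for illustration)
clear
input id exposure
1 10
1 14
1 15
2 8
2 10
2 15
end

* Assumption: the current order within id is the intended one.
* Record it, so that later sorts cannot shuffle ties within id.
gen long obsno = _n

* Running sum of exposure within each id, in the recorded order
bysort id (obsno): gen cum_exposure = sum(exposure)

list, sepby(id)
```

With these data the last observation for id 2 
gets 8 + 10 + 15 = 33. If you do have a time 
variable, put it in the parentheses in place 
of obsno. 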

Nick 
n.j.cox@durham.ac.uk 

Raoul C Reulen
 
> I’ve got multiple records per person and want to calculate 
> some kind of cumulative exposure index per person. I’ve got a 
> variable called “exposure” and want to create a new variable 
> called “cum_exposure”. Each observation in the cum_exposure 
> variable should give the sum of the current and all previous 
> cells in the exposure column (but per person). So something 
> like this: 
> 
> Id   Exposure   list   Cum.Exposure
> 1    10         1      10
> 1    14         2      24
> 1    15         3      39
> 2    8          1      8
> 2    10         2      18
> 2    15         3      33
> 
> I’ve tried this:
> 
> .by id: gen list=_n 
> .gen cum_exposure=.
> .replace cum_exposure= exposure[1]  if list==1
> .replace cum_exposure= exposure[1] + exposure [2]  if list==2
> .replace cum_exposure= exposure[1] + exposure [2] + exposure[3]  if list==3
> 
> But how can I do this more efficiently? In a loop for example?
