Re: st: RE: coding problem: looping through a list of ID's

From   Jeph Herrin <>
Subject   Re: st: RE: coding problem: looping through a list of ID's
Date   Mon, 24 Nov 2008 17:02:26 -0500

I think my solution does what you want, and arguably does it better,
in that the best approximation to the increase in height per 365 days
is the average increase per day between visits, multiplied by 365.

Stata stores daily dates such as the date of visit as integers
counting days (from 1 January 1960). Thus the difference in height
between consecutive visits, divided by the difference in dates, is
the change in height per day between those visits. Do this for each
visit, and then average over visits. It needs only two lines of code,
per my solution, though to get the rate per year you will want to
multiply by the number of days in a year:

  bysort ID (dov) : gen rate=(height-height[_n-1])/(dov-dov[_n-1])
  bysort ID       : egen average=mean(rate*365.25)
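
To try the two lines above on real numbers, here is a minimal sketch using a few of patient 2's visits from the quoted message below. The names ID and height stand in for pt_id and pt_ht, and dovstr is a hypothetical helper variable that holds the date as text before conversion; none of these choices come from the original data set.

```stata
* Toy data: five of patient 2's visits (ID/height stand in for pt_id/pt_ht)
clear
input ID str9 dovstr height
2 "24may2001" 141
2 "11dec2001" 145
2 "2may2002"  147
2 "21nov2002" 152
2 "1may2003"  153.4
end

* Convert the text date to a Stata daily date (integer days since 01jan1960)
gen dov = date(dovstr, "DMY")
format dov %td

* Growth per day between consecutive visits (missing for the first visit)
bysort ID (dov) : gen rate = (height - height[_n-1]) / (dov - dov[_n-1])

* Mean of the per-interval rates, rescaled to growth per year
bysort ID       : egen average = mean(rate*365.25)

list ID dov height rate average, sepby(ID)
```

Note that mean() gives every interval the same weight, however long or short it is, so a pair of visits a week apart counts as much as a pair six months apart.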

If you truly want to ignore the information in the visits that
are less than a year apart, and only look at the growth between
visits that are "closest" to one year from the first visit, try

 * put date in years
 gen year=dov/365.25

 * get gap to nearest whole years from first visit
 bysort ID (dov) : gen gap = abs(round(year-year[1])-(year-year[1]))

 * use the visit if closer than neighbor visits to one year
 bysort ID (dov) : gen use = gap<gap[_n-1]&gap<=gap[_n+1]

where I've made the last inequality non-strict (<=) to deal with
potential ties. Now just keep the visits you're using and apply the
solution above, only now the denominator is assumed to be 1 year,
though of course that's just a rough approximation:

 keep if use
 * growth since the previous retained visit, assumed to be one year earlier
 bysort ID (dov) : gen rate = height-height[_n-1]
 bysort ID       : egen average=mean(rate)

where the last line gives you the average for the subject over
all of their years.
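
For anyone who wants to run the whole "closest to one year" approach end to end, here is a sketch on patient 2's visits as listed in the quoted message below. As assumptions of mine: ID and height stand in for pt_id and pt_ht, dovstr is a hypothetical text version of the date, and the yearly growth is taken as the difference between consecutive retained visits (height[_n-1]), which is how I read "apply solution above".

```stata
* Patient 2's visits from the quoted message (ID/height stand in for pt_id/pt_ht)
clear
input ID str9 dovstr height
2 "24may2001" 141
2 "31may2001" 141
2 "11sep2001" 141
2 "11dec2001" 145
2 "21feb2002" 146
2 "2may2002"  147
2 "11jul2002" 149
2 "21nov2002" 152
2 "21jan2003" 152.3
2 "10apr2003" 153.7
2 "1may2003"  153.4
2 "1jul2003"  153.8
2 "9sep2003"  154.8
2 "18nov2003" 154.8
2 "20jan2004" 156
2 "18mar2004" 157
2 "20may2004" 156
2 "15jul2004" 157
2 "16sep2004" 157
2 "11nov2004" 158
end

* Convert the text date to a Stata daily date, then to years
gen dov = date(dovstr, "DMY")
format dov %td
gen year = dov/365.25

* Distance of each visit from the nearest whole number of years since visit 1
bysort ID (dov) : gen gap = abs(round(year-year[1]) - (year-year[1]))

* Flag visits that are closer to a whole year than both neighbouring visits
bysort ID (dov) : gen use = gap < gap[_n-1] & gap <= gap[_n+1]

* Keep the flagged visits and difference consecutive ones
keep if use
bysort ID (dov) : gen rate = height - height[_n-1]
bysort ID       : egen average = mean(rate)

list ID dov height rate average, sepby(ID)
```

On these data the retained visits should be 24 May 2001, 2 May 2002, 1 May 2003 and 20 May 2004, which matches the intervals described in the quoted message, with yearly increases of about 6, 6.4 and 2.6. The first visit is always retained because gap[_n-1] is missing there, and Stata treats missing as larger than any number.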

hope this helps,

Leny Mathew wrote:
Thanks Nick & Jeff for your suggestions. I think that I was perhaps
not clear enough with what I was trying to do. The following is the
data for one of the patients:

pt_id   dov         pt_ht   age
2       24-May-01   141     12.92519
2       31-May-01   141     12.94441
2       11-Sep-01   141     13.22718
2       11-Dec-01   145     13.47701
2       21-Feb-02   146     13.67467
2       2-May-02    147     13.86685
2       11-Jul-02   149     14.05903
2       21-Nov-02   152     14.42416
2       21-Jan-03   152.3   14.59163
2       10-Apr-03   153.7   14.80851
2       1-May-03    153.4   14.86616
2       1-Jul-03    153.8   15.03363
2       9-Sep-03    154.8   15.22581
2       18-Nov-03   154.8   15.41798
2       20-Jan-04   156     15.59094
2       18-Mar-04   157     15.75017
2       20-May-04   156     15.92313
2       15-Jul-04   157     16.07687
2       16-Sep-04   157     16.24983
2       11-Nov-04   158     16.40357


I'm trying to find out how much this patient grew in a year. The way
I thought of it is to find the interval of time that approximates a
year and then compute the difference in height between those two
points. So taking the difference between _n and _n-1 will not
suffice; it has to be between _n and _n-i, where i goes from 1 to _N
within patient.
For example, the first time point is 24 May 01 and the time that
approximates a year is 2 May 02. So the height increase would be
147 - 141 = 6.
Then the next one would be from 2 May 02 to 1 May 03 and so on.
The patient number 3 has 105 visits from 1988 to 2000, so the number
of intervals would be more compared to ID 2. I would most probably end
up using the most recent one year height increase for analysis
purposes, but that doesn't make this any easier.

I hope this clarifies things better.
