
From |
"Nick Cox" <n.j.cox@durham.ac.uk> |

To |
<statalist@hsphsun2.harvard.edu> |

Subject |
RE: st: Reshape-Like Question |

Date |
Wed, 4 Mar 2009 17:38:04 -0000 |

I see. I'll stick to my stance and discuss how to get measures working from your long structure.

The two most obviously tricky details in your problem are

1. The stipulation of non-consecutive days. (This presumably arises because consecutive days are thought likely to be dependent.)

2. The use of 30 day periods when data are likely to be at least a little irregular in time.

I'll focus on period means, each of daily means for blood glucose. You want something else, but at best the "something else" is not your major problem, but rather the two features singled out above.

First, I get those daily means and count how many measurements they are based on

    bysort pid datestamp : gen mean = sum(bglevel)
    by pid datestamp : gen N = sum(bglevel < .)
    by pid datestamp : replace mean = mean[_N] / N[_N]

Now keep one observation for each day. We keep the _last_ for each day as that contains not just the mean -mean- but also the number of measurements -N-.

    by pid datestamp : keep if _n == _N

Now drop consecutive days, interpreted as any day following another day:

    by pid : drop if datestamp == datestamp[_n-1] + 1

Here is a brute force way of averaging over the previous 30 days. We keep track of how many days each average is based on and how many of those days included 3 or more measurements. The period mean goes into a new variable -pmean-, so that the daily means in -mean- are not overwritten while the loop is still reading them. The technique is written up within

    SJ-7-3  pr0033  . . . . . . . . . . . . . .  Stata tip 51: Events in intervals
            . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
            Q3/07   SJ 7(3):440--443                                 (no commands)
            tip for counting or summarizing irregularly spaced events
            in intervals

    gen pmean = .
    gen Ndays = .
    gen Nge3days = .

    qui forval i = 1/`=_N' {
        su mean if N >= 3 & inrange(datestamp[`i'] - datestamp, 1, 30) ///
            & pid == pid[`i']
        replace pmean = r(mean) in `i'
        count if inrange(datestamp[`i'] - datestamp, 1, 30) & pid == pid[`i']
        replace Ndays = r(N) in `i'
        count if N >= 3 & inrange(datestamp[`i'] - datestamp, 1, 30) & pid == pid[`i']
        replace Nge3days = r(N) in `i'
    }

At its broadest, the idea is mundane:

    Initialise variables to be calculated

    Loop over observations {
        -count- or calculate whatever is of interest for observations
        in the same panel within a specified time interval

        -replace- variables with results obtained
    }

Nick
n.j.cox@durham.ac.uk

Alan Neustadtl

Yes...my thanks to Scott and Nick for pointing out the crucial fact of creating a unique identifier. I tried that, but incorrectly specified the reshape, then dropped the id and incorrectly specified the reshape. I had all the pieces to the puzzle but couldn't put them together until I was pointed in the correct direction.

As for Nick's other comments, he is probably right that it may be possible to work column wise on the data and my limits might be in seeing the big picture. What I am trying to do is create a symmetrical measure of blood glucose variability called the "average daily risk range" (ADRR). The measure requires that each participant has a minimum of three blood glucose readings for at least 14 nonconsecutive days of readings in a 30 day period.

Using rowwise egen commands gave me some leverage on identifying the relevant patients. I tried using -by- and -collapse- to come to the same place (one risk range measure per patient/day) but eventually became lost in the details and worked this into a -reshape- problem.

I am open to learning new things, so if you have time I would appreciate other attacks on my problem.
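As a rough illustration only, the counting loop from Nick's reply can be tried on toy data and followed by a check of the requirement Alan describes (at least 14 days with three or more readings). Everything here is an assumption made for the sketch: the data are invented, the data are taken to be already reduced to one observation per patient and day, -N- is set by hand, -eligible- is a hypothetical name, and the window is the previous 30 days excluding the current day, as in the loop above. Whether that is the right window for the ADRR is not settled by the thread.

```stata
* Toy data (invented): one patient, 20 measurement days, every second day,
* so no two days are consecutive
clear
set obs 20
gen pid = 1
gen datestamp = mdy(1, 1, 2009) + 2 * (_n - 1)
format datestamp %td

* Pretend each day has 3 readings, except the 5th day, which has only 2
gen N = 3
replace N = 2 in 5

* Initialise the results, then loop over observations
gen Ndays = .
gen Nge3days = .

quietly forvalues i = 1/`=_N' {
    * days for the same patient in the previous 30 days
    count if inrange(datestamp[`i'] - datestamp, 1, 30) & pid == pid[`i']
    replace Ndays = r(N) in `i'

    * of those, days with 3 or more readings
    count if N >= 3 & inrange(datestamp[`i'] - datestamp, 1, 30) & pid == pid[`i']
    replace Nge3days = r(N) in `i'
}

* Hypothetical flag: 14 or more qualifying days in the window.
* Nge3days is never missing here because -count- always returns a number.
gen byte eligible = Nge3days >= 14

list pid datestamp N Ndays Nge3days eligible, sepby(pid)
```

The loop is quadratic in the number of observations, which is fine for modest data sets but slow for large ones; that is the price of the brute force approach.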
On Wed, Mar 4, 2009 at 8:51 AM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> Scott's example code underlines that this is indeed a -reshape- problem and that you just need the one trick of creating an identifier that will be used for the columns.
>
> Other trickery in this territory is detailed at
>
> FAQ     . . . . . . . . . . . . . . . . . . . . . . . . Problems with reshape
>         12/03   I am having problems with the reshape command. Can
>                 you give further guidance?
>                 http://www.stata.com/support/faqs/data/reshape3.html
>
> But a bigger question is why you want to do this. On the whole you are better off with your existing data structure. Although working rowwise is possible and often natural, as will be explored in some detail in Stata Journal 9(1) 2009, it is difficult to think of anything easier with your new data structure.
>

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

**References**:
- **st: Reshape-Like Question** *From:* Alan Neustadtl <alan.neustadtl@gmail.com>
- **Re: st: Reshape-Like Question** *From:* Scott Merryman <scott.merryman@gmail.com>
- **RE: st: Reshape-Like Question** *From:* "Nick Cox" <n.j.cox@durham.ac.uk>
- **Re: st: Reshape-Like Question** *From:* Alan Neustadtl <alan.neustadtl@gmail.com>
