
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: Variable running totals


From   "Schaffer, Mark E" <[email protected]>
To   <[email protected]>
Subject   RE: st: Variable running totals
Date   Fri, 1 Jun 2012 21:51:42 +0100

Thank you Nick and Jorge Eduardo!  I will forward both your solutions to my colleague.

Cheers,
Mark

> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Nick Cox
> Sent: 01 June 2012 01:59
> To: [email protected]
> Subject: Re: st: Variable running totals
> 
> Looping over observations is easier than might be thought. A discussion at
> 
> SJ-7-3  pr0033  . . . . . . . . . . . . . .  Stata tip 51: Events in intervals
>         . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>         Q3/07   SJ 7(3):440--443                                 (no commands)
>         tip for counting or summarizing irregularly spaced
>         events in intervals
> 
> is accessible at 
> http://www.stata-journal.com/sjpdf.html?articlenum=pr0033
> 
> In this case, consider
> 
> * sandpit
> 
> input id      date    count30
>  1       1000            1
>  1       1002            2
>  1       1002            3
>  1       1200            1
>  1       1250            1
>  2       1050            1
>  2       1059            2
>  2       1085            2
> end
> 
> * solution code
> 
> gen mycount30 = .
> qui forval i = 1/`=_N' {
> 	count if id == id[`i'] & inrange(date, date[`i'] - 30, date[`i'])
> 	replace mycount30 = r(N) in `i'
> }
> 
> I suggest that this code is simpler than Jorge Eduardo's. 
> Relative efficiency will depend on the number of identifiers 
> and the number of observations (and, I suggest, on how long 
> it takes to write code and revise it for related problems).
> 
> 
> 
> Nick
> 
> On Thu, May 31, 2012 at 9:27 PM, Schaffer, Mark E 
> <[email protected]> wrote:
> 
> > Hi all.  "Variable running totals" isn't the best description of the
> > problem, but it's not too far off.
> >
> > A colleague has written to me with the following problem.  He has a
> > panel dataset with two variables: id and date.  (He has some other
> > variables but those are the two that matter.)  There may be multiple
> > observations on id for a given date.  The date variable is in Stata
> > %td format (number of days since 01jan1960).  So it looks like this:
> >
> > id      date
> > 1       1000
> > 1       1002
> > 1       1002
> > 1       1200
> > 1       1250
> > 2       1050
> > 2       1059
> > 2       1085
> >
> > ...etc.
> >
> >
> > The question is how to construct a variable that counts the number of
> > observations in which an individual (id) appears in the dataset up to
> > 30 days previously.  If we call the variable count30, it would look
> > like this:
> >
> > id      date    count30
> > 1       1000            1
> > 1       1002            2
> > 1       1002            3
> > 1       1200            1
> > 1       1250            1
> > 2       1050            1
> > 2       1059            2
> > 2       1085            2
> >
> > ...etc.
> >
> > I suspect there's an easy way of doing this, but the only ways I could
> > think of involved brute-force looping through observations.
> >
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 
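A note on ties in the sandpit example: the looping code quoted above counts every observation on the same id inside the 30-day window, so both observations dated 1002 for id 1 would get 3, whereas the count30 column shows 2 and then 3. The following is a minimal sketch of one variant that reproduces count30 for the eight example observations. It assumes that the intended count is a running one in the current sort order, which the thread does not state explicitly. The name mycount30b is hypothetical and not from the thread.

```stata
* sketch: running count within id over the previous 30 days, in current sort order
* mycount30b is a hypothetical name; assumes id, date, count30 as in the sandpit
gen mycount30b = .
quietly forvalues i = 1/`=_N' {
    * count same-id observations in the window that sit at or before observation i
    count if id == id[`i'] & inrange(date, date[`i'] - 30, date[`i']) & _n <= `i'
    replace mycount30b = r(N) in `i'
}
list id date count30 mycount30b, sepby(id)
```

The only change from the quoted loop is the extra condition _n <= `i', which excludes later observations (including later ties on the same date) from each count. The result therefore depends on how the data are sorted when the loop runs.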





