Re: st: Variable running totals


From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: Variable running totals
Date   Fri, 1 Jun 2012 01:58:34 +0100

Looping over observations is easier than might be thought. A discussion at

SJ-7-3  pr0033  . . . . . . . . . . . . . .  Stata tip 51: Events in intervals
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q3/07   SJ 7(3):440--443                                 (no commands)
        tip for counting or summarizing irregularly spaced
        events in intervals

is accessible at http://www.stata-journal.com/sjpdf.html?articlenum=pr0033

In this case, consider

* sandpit

input id      date    count30
 1       1000            1
 1       1002            2
 1       1002            3
 1       1200            1
 1       1250            1
 2       1050            1
 2       1059            2
 2       1085            2
end

* solution code

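* for each observation, count the observations with the same id whose
* date lies in the 30 days up to and including that observation's date
* (observations tied on the same date are all counted)
gen mycount30 = .
qui forval i = 1/`=_N' {
	count if id == id[`i'] & inrange(date, date[`i'] - 30, date[`i'])
	replace mycount30 = r(N) in `i'
}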

I suggest that this code is simpler than Jorge Eduardo's. Relative
efficiency will depend on the number of identifiers and the number of
observations (and, I suggest, on how long it takes to write code and
revise it for related problems).
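
One detail worth checking on the sandpit data is tied dates. Both observations dated 1002 for id 1 fall inside each other's window, so the loop above appears to give 3 for each of them, whereas count30 shows 2 and then 3. What follows is only a sketch of one way to reproduce count30 exactly. It assumes that the observations are already in the intended order within id and date, and the name mycount30b is made up for illustration. The one change is an extra condition on _n, so that observations later in the dataset are ignored.

```stata
* sketch: running 30-day count that treats tied dates in dataset order
* assumption: observations are already sorted as intended within id and date
clear
input id      date    count30
 1       1000            1
 1       1002            2
 1       1002            3
 1       1200            1
 1       1250            1
 2       1050            1
 2       1059            2
 2       1085            2
end

* mycount30b is a hypothetical name, not part of the original post
gen mycount30b = .
qui forval i = 1/`=_N' {
	* same window as before, but only observations up to and including i
	count if id == id[`i'] & inrange(date, date[`i'] - 30, date[`i']) & _n <= `i'
	replace mycount30b = r(N) in `i'
}

* stops with an error if any value differs from the expected count30
assert mycount30b == count30
```

If tied observations should instead all receive the same total, the extra condition is not needed.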



Nick

On Thu, May 31, 2012 at 9:27 PM, Schaffer, Mark E <[email protected]> wrote:

> Hi all.  "Variable running totals" isn't the best description of the
> problem, but it's not too far off.
>
> A colleague has written to me with the following problem.  He has a
> panel dataset with two variables: id and date.  (He has some other
> variables but those are the two that matter.)  There may be multiple
> observations on id for a given date.  The date variable is in Stata %td
> format (#days after 01jan1960).  So it looks like this:
>
> id      date
> 1       1000
> 1       1002
> 1       1002
> 1       1200
> 1       1250
> 2       1050
> 2       1059
> 2       1085
>
> ...etc.
>
>
> The question is how to construct a variable that counts, for each
> observation, how many times that individual (id) has appeared in the
> dataset in the previous 30 days, up to and including that observation.
> If we call the variable count30, it would look like this:
>
> id      date    count30
> 1       1000            1
> 1       1002            2
> 1       1002            3
> 1       1200            1
> 1       1250            1
> 2       1050            1
> 2       1059            2
> 2       1085            2
>
> ...etc.
>
> I suspect there's an easy way of doing this, but the only ways I could
> think of involved brute force looping through observations.
>
