# st: RE: Calculate rolling mean using last observation in a group

 From "Nick Cox" To Subject st: RE: Calculate rolling mean using last observation in a group Date Tue, 12 Aug 2008 13:51:17 +0100

```If I understand your question correctly, it yields to a minor variation
on the technique documented in

SJ-7-3  pr0033  . . . . . . . . . . . . . .  Stata tip 51: Events in
intervals
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N.
J. Cox
Q3/07   SJ 7(3):440--443                                 (no
commands)
tip for counting or summarizing irregularly spaced
events in intervals

<http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist
.0808/date/article-20.html>

local N = _N
gsort Company_No - Date
gen avex = .
qui forval i = 1/`N' {
egen tag = tag(Investor_id) ///
if Company_no == Company_no[`i'] & ///
inrange(Date, Date[`i'] - 90, Date[`i'])

su x if tag, meanonly

replace avex = r(mean) in `i'

drop tag
}

The obvious variations on the previous example are using -summarize,
meanonly- to get the mean and specifying the previous 90 days within an
-inrange()- condition. (This code includes the current date; modify
according to taste.)

The not-so-obvious variation is using -egen, tag()- to tag the _last_
relevant observation for each distinct Investor_id.

-egen, tag()- tags just one of any repetitions of each distinct value of
the variable specified. It's not documented, but the observation tagged
is in fact the _first_ such occurrence in the dataset in the current
sort order. Hence use -gsort- first to reverse date order in each panel,
so that tagging catches the last such occurrence in time.

Nick
n.j.cox@durham.ac.uk

Marietta Jones

I have a new problem related to the dataset I presented in my original
email (see below).

For each observation in the dataset, I would like to calculate the
mean of variable x for the same company_no on a rolling prior 90 day
basis.
The extra complexity arises by the fact that there may be more than
one observations of variable x for the same investor_id for a given
company_no.
I would only like to include the last observation per investor_id for
a given company_no in the calculation of the mean.

>> I have the following dataset:
>>
>> Company_no     Date          Investor_id
>> 1                     03/01/2000          1
>> 1                     04/09/2000          1
>> 1                     12/12/2001          2
>> 2                     13/12/2000          4
>> 2                     07/08/2001          7
>> 3                     09/08/2000          4
>> 3                     19/03/2001          4
>> 3                     02/05/2001          5
>> 3                     03/12/2001          6
>>
>>
>> For each observation, I would like to calculate the number of
distinct
>> investors owning shares in a given company in the previous 12 months.
>> The new variable (No_investors_past_12_months) will look as follows:
>>
>> Company_no       Date        Investor_id
No_investors_past_12_months
>> 1                     03/01/2000         1                     1
>> 1                     04/09/2000         1                     1
>> 1                     12/12/2001         2                     1
>> 2                     13/12/2000         4                     1
>> 2                     07/08/2001         7                     2
>> 3                     09/08/2000         4                     1
>> 3                     19/03/2001         4                     1
>> 3                     02/05/2001         5                     2
>> 3                     03/12/2001         6                     3
>>
>> For example, for company 2 on 07/08/2001 there are two investors who
>> have owned shares of this company within the past 12 months.
>>
>> The idea is that the same investor should be counted only once and we
>> should count all observations over the past 12 months, including the
>> current observation date. If there are no observations over the past
>> 12 months, the value of the new variable should equal 1.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```