Statalist


[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

st: RE: Count no of unique occurrences on a prior-12-month basis


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: Count no of unique occurrences on a prior-12-month basis
Date   Fri, 1 Aug 2008 17:19:09 +0100

It's a bit naughty posting the same question three times. It was clear
first time, just that nobody was able and willing to answer before now. 

I don't understand the last sentence ("If there are no observations over
the past
12 months, the value of the new variable should equal 1"), but otherwise
here goes: 

This problem is the subject of a Stata Journal Tip in which the
associated ideas are discussed at excruciating length. 

SJ-7-3  pr0033  . . . . . . . . . . . . . .  Stata tip 51: Events in
intervals
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N.
J. Cox
        Q3/07   SJ 7(3):440--443                                 (no
commands)
        tip for counting or summarizing irregularly spaced
        events in intervals

The basic idea explored in that Tip is, however, quite simple. 

We want to count how many investors held stock in that company in the
previous 12 months. The Tip analyses it something like this: 

Focus on observation 1:

count if Company_no == Company_no[1] & inrange(Date, Date[1] - 365,
Date[1]) 

To extend this to all observations, we loop over observations and make
sure that we store values as they emerge from a -count-: 

local N = _N 
gen count = . 
qui forval i = 1/`N' { 
	count if Company_no == Company_no[`i'] & inrange(Date, Date[`i']
- 365, Date[`i'])
	replace count = r(N) in `i' 
} 

Except that that doesn't count each investor just once, as stipulated. 

Here's a way of building that in: 

local N = _N 
gen count = . 
qui forval i = 1/`N' { 
	egen tag = tag(Investor_id) ///
	if Company_no == Company_no[`i'] & ///
	inrange(Date, Date[`i'] - 365, Date[`i'])

	count if tag 

	replace count = r(N) in `i'

	drop tag 
}

Warning: untested code. If the precise definition here of previous year
doesn't suit, then you need to change it. 

If this problem interests you, here is more of the same: 

SJ-7-4  dm0033  . . . . . . Speaking Stata: Counting groups, especially
panels
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N.
J. Cox
        Q4/07   SJ 7(4):571--581                                 (no
commands)
        discusses how to count panels through reduction commands
        or through tabulation commands and how to overcome
        problems that do not yield easily to these approaches

SJ-7-1  pr0029  . . . . . . . . . . . . . . .  Speaking Stata: Making it
count
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N.
J. Cox
        Q1/07   SJ 7(1):117--130                                 (no
commands)
        discusses count used with a loop over observations
        or variables

Nick
[email protected] 

Marietta Jones

I have the following dataset:

Company_no     Date          Investor_id
1                     03/01/2000          1
1                     04/09/2000          1
1                     12/12/2001          2
2                     13/12/2000          4
2                     07/08/2001          7
3                     09/08/2000          4
3                     19/03/2001          4
3                     02/05/2001          5
3                     03/12/2001          6


For each observation, I would like to calculate the number of distinct
investors owning shares in a given company in the previous 12 months.
The new variable (No_investors_past_12_months) will look as follows:

Company_no       Date        Investor_id   No_investors_past_12_months
1                     03/01/2000         1                     1
1                     04/09/2000         1                     1
1                     12/12/2001         2                     1
2                     13/12/2000         4                     1
2                     07/08/2001         7                     2
3                     09/08/2000         4                     1
3                     19/03/2001         4                     1
3                     02/05/2001         5                     2
3                     03/12/2001         6                     3

For example, for company 2 on 07/08/2001 there are two investors who
have owned shares of this company within the past 12 months.

The idea is that the same investor should be counted only once and we
should count all observations over the past 12 months, including the
current observation date. If there are no observations over the past
12 months, the value of the new variable should equal 1.


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2024 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index