Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: Data management _flag on the basis of the frequency


From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: Data management _flag on the basis of the frequency
Date   Tue, 13 Mar 2012 19:30:20 +0000

Here's another way to approach it. I assume the existence of a daily
date variable -date-.

gen flag = 0
gen possible = 0

quietly forval i = 1/`=_N' {
       replace possible = firm_id == firm_id[`i'] & director ==
director[`i'] & inrange(year(date), year(date[`i']), year(date[`i']) -
2) & month(date) == month(date[`i'])

       tab year if possible
       if r(r) == 3 replace flag = 1 in `i'
}

The idea is a loop over observations. The rules are (a) same firm (b)
same director (c) this year or previous two (d) same month. That
defines possibles. We still need to check that all three years are
represented in the possibles, which is true if a table of years has
precisely three rows.

This is the often despised "loop over observations". Occasionally it
does yield code that is rather direct if also rather messy. It is not
very fast, but it is often relatively quick to code.

Code not tested. For more discussion of the strategy, see

SJ-7-3  pr0033  . . . . . . . . . . . . . .  Stata tip 51: Events in intervals
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q3/07   SJ 7(3):440--443                                 (no commands)
        tip for counting or summarizing irregularly spaced
        events in intervals

which is accessible to all via the Stata Journal website.

However, the rules don't seem complete. If there is a record in
9/1998, 9/1999, 9/2000 shouldn't the last be flagged 1 similarly?

Nick

On Tue, Mar 13, 2012 at 12:55 PM, Agnieszka Trzeciakiewicz
<[email protected]> wrote:

> I'm trying to mark an occurrence of an event in a particular firm, made by a
> particular director.
>
> The idea is to mark an occurrence with number 1, if there is an available
> record in the same month over the past three consecutive years.
> For instance if for a director A there is a record in 09/2000, 09/2001, and
> 09/2002 I would like to assign year 2000 with mark 0, year 2001 with mark 0
> and 2002 with mark 1.
>
> I have used Excel to solve my problem, and applied a relevant long formula .
> However, Excel broke down each time I run it. The original file has over
> 50000 records. Please find the table below.
>
> Is it possible to create ''mark'' column using STATA software and its data
> management codes?
> Thanks for your help.
> Best wishes,
> Agnieszka
>
> Table:
> full date       firm_id director_id     mark
> 22/10/2002      1       7       0
> 07/04/2003      1       7       0
> 11/04/2003      1       7       0
> 01/10/2003      1       7       0
> 01/10/2003      1       7       0
> 20/10/2003      1       7       0
> 07/04/2004      1       7       0
> 08/04/2004      1       7       0
> 16/04/2004      1       7       0
> 25/10/2004      1       7       1
> 12/11/2004      1       7       0
> 07/04/2005      1       7       1
> 27/04/2005      1       7       1
> 25/08/2005      1       7       0
> 05/09/2005      1       7       0
> 12/09/2005      1       7       0
> 25/10/2005      1       7       1
> 25/04/2006      1       7       1
> 25/04/2006      1       7       1
> 05/09/2006      1       7       0
> 05/09/2006      1       7       0
> 28/09/2006      1       7       0
> 28/09/2006      1       7       0
> 28/09/2006      1       7       0
> 28/09/2006      1       7       0
> 28/09/2006      1       7       0
> 25/10/2006      1       7       1
> 27/04/2007      1       7       1
> 27/04/2007      1       7       1
> 06/09/2007      1       7       1
> 06/09/2007      1       7       1
> 06/09/2007      1       7       1
> 06/09/2007      1       7       1
> 06/09/2007      1       7       1
> 26/10/2007      1       7       1
> 01/11/2007      1       7       0
> 31/03/2008      1       7       0
> 31/03/2008      1       7       0
> 24/04/2008      1       7       1
> 24/04/2008      1       7       1
> 03/09/2008      1       7       1
> 03/09/2008      1       7       1
> 16/09/2008      1       7       1
> 14/10/2008      1       7       1
> 07/04/2009      1       7       1
> 07/04/2009      1       7       1
> 07/09/2009      55      7       0
> 07/09/2009      55      7       0
> 07/09/2009      55      7       0
> 23/09/2009      55      7       0
> 23/09/2009      55      7       0
> 23/09/2009      1       7       1
> 08/10/2009      1       7       1
> 22/03/2010      1       7       0
> 22/03/2010      1       7       0
> 23/03/2010      1       7       0
> 23/03/2010      1       7       0
> 23/03/2010      1       7       0
> 12/04/2010      1       7       1
> 07/09/2010      1       7       1
> 07/09/2010      1       7       1
> 07/09/2010      1       7       1
> 07/09/2010      1       7       1
> 07/09/2010      1       7       1
> 10/09/2010      1       7       1
> 06/10/2010      1       7       1
> 27/05/2009      2       8       0
> 09/12/2009      2       8       0
> 05/02/2010      2       8       0
> 20/05/2010      2       8       0
> 23/08/2010      2       8       0
> 22/04/2008      3       10      0
> 18/12/2008      3       10      0
> 24/04/2009      3       10      0
> 24/04/2009      3       10      0
> 19/03/2010      3       10      0
> 19/03/2010      3       10      0
>
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index