



Re: st: Data management _flag on the basis of the frequency


From   Eric Booth <eric.a.booth@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Data management _flag on the basis of the frequency
Date   Tue, 13 Mar 2012 09:47:55 -0500



Because some years repeat (see below), more than one observation could qualify as the last year to mark with a 1, so you need a decision rule for which observation to mark.  For director "7" in month "4", for example, the last year is 2007, but there is more than one observation for that year.  Either use some other variable to determine which is the last observation, or perhaps -collapse- the data in some way to remove this duplication.  Without making any corrections for this issue, here is an example of how to mark your observations of interest.

****************!
clear
**data ex. shortened a bit:===>
inp str15 fulldate	firm_id	director_id	mark
"22/10/2002"	1	7	0
"07/04/2003"	1	7	0
"11/04/2003"	1	7	0
"01/10/2003"	1	7	0
"01/10/2003"	1	7	0
"20/10/2003"	1	7	0
"07/04/2004"	1	7	0
"08/04/2004"	1	7	0
"16/04/2004"	1	7	0
"25/10/2004"	1	7	1
"12/11/2004"	1	7	0
"07/04/2005"	1	7	1
"27/04/2005"	1	7	1
"25/08/2005"	1	7	0
"05/09/2005"	1	7	0
"12/09/2005"	1	7	0
"25/10/2005"	1	7	1
"25/04/2006"	1	7	1
"25/04/2006"	1	7	1
"05/09/2006"	1	7	0
"05/09/2006"	1	7	0
"28/09/2006"	1	7	0
"28/09/2006"	1	7	0
"28/09/2006"	1	7	0
"28/09/2006"	1	7	0
"28/09/2006"	1	7	0
"25/10/2006"	1	7	1
"27/04/2007"	1	7	1
"27/04/2007"	1	7	1
"06/09/2007"	1	7	1
"06/09/2007"	1	7	1
"06/09/2007"	1	7	1
"06/09/2007"	1	7	1
end
drop mark

**format dates
*see -help dates-
g date = date(fulldate, "DMY")
format date %td
g month = month(date)
g year = year(date)


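**create mark2: 1 for the last obs (latest year) in each director-month, else 0
**note: repeat years are not ordered further, so which duplicate gets the 1 is arbitrary
bys director_id month (year): g byte mark2 = (_n == _N)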

**example:
l if director_id == 7 & month ==4 //note the repeat years within april for director "7"
****************!
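
The example above only marks the last observation in each director-month. If the goal is the rule from the original post (a 1 when the same month also has a record in each of the two preceding years), a rough sketch along these lines might work. It assumes the month and year variables created above; the names run and mark3 are made up for illustration, and grouping by firm_id as well as director_id is only my reading of the posted table, so adjust as needed.

****************!
**sketch only: run = number of consecutive years (same month) ending in this obs
**repeat years within a month share the same count
bys firm_id director_id month (year): g int run = 1 if _n == 1
by firm_id director_id month: replace run = ///
	cond(year == year[_n-1], run[_n-1], ///
	cond(year == year[_n-1] + 1, run[_n-1] + 1, 1)) if _n > 1

**mark3 = 1 when this year and the two before it all have a record
g byte mark3 = (run >= 3)

l fulldate firm_id director_id run mark3 if director_id == 7 & month == 4
****************!

Because -replace- works through the observations in order, run[_n-1] already holds the updated count when each observation is evaluated. Every observation in a qualifying year and month gets a 1, so this version does not need a rule for choosing among repeat years.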

- Eric
__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
ebooth@ppri.tamu.edu
Office: +979.845.6754


On Mar 13, 2012, at 7:55 AM, Agnieszka Trzeciakiewicz wrote:

> Hi!
> I'm trying to mark an occurrence of an event in a particular firm, made by a
> particular director.
> 
> The idea is to mark an occurrence with 1 if the same director has a record
> in the same month in each of three consecutive years, ending with the year
> of that occurrence.
> For instance, if director A has a record in 09/2000, 09/2001, and
> 09/2002, I would like to assign mark 0 to year 2000, mark 0 to year 2001,
> and mark 1 to year 2002.
> 
> I have used Excel to solve my problem by applying a long formula.
> However, Excel crashed each time I ran it. The original file has over
> 50000 records. Please find the table below.
> 
> Is it possible to create the "mark" column using Stata and its data
> management commands?
> Thanks for your help.
> Best wishes,
> Agnieszka
> 
> Table:
> full date	firm_id	director_id	mark
> 22/10/2002	1	7	0
> 07/04/2003	1	7	0
> 11/04/2003	1	7	0
> 01/10/2003	1	7	0
> 01/10/2003	1	7	0
> 20/10/2003	1	7	0
> 07/04/2004	1	7	0
> 08/04/2004	1	7	0
> 16/04/2004	1	7	0
> 25/10/2004	1	7	1
> 12/11/2004	1	7	0
> 07/04/2005	1	7	1
> 27/04/2005	1	7	1
> 25/08/2005	1	7	0
> 05/09/2005	1	7	0
> 12/09/2005	1	7	0
> 25/10/2005	1	7	1
> 25/04/2006	1	7	1
> 25/04/2006	1	7	1
> 05/09/2006	1	7	0
> 05/09/2006	1	7	0
> 28/09/2006	1	7	0
> 28/09/2006	1	7	0
> 28/09/2006	1	7	0
> 28/09/2006	1	7	0
> 28/09/2006	1	7	0
> 25/10/2006	1	7	1
> 27/04/2007	1	7	1
> 27/04/2007	1	7	1
> 06/09/2007	1	7	1
> 06/09/2007	1	7	1
> 06/09/2007	1	7	1
> 06/09/2007	1	7	1
> 06/09/2007	1	7	1
> 26/10/2007	1	7	1
> 01/11/2007	1	7	0
> 31/03/2008	1	7	0
> 31/03/2008	1	7	0
> 24/04/2008	1	7	1
> 24/04/2008	1	7	1
> 03/09/2008	1	7	1
> 03/09/2008	1	7	1
> 16/09/2008	1	7	1
> 14/10/2008	1	7	1
> 07/04/2009	1	7	1
> 07/04/2009	1	7	1
> 07/09/2009	55	7	0
> 07/09/2009	55	7	0
> 07/09/2009	55	7	0
> 23/09/2009	55	7	0
> 23/09/2009	55	7	0
> 23/09/2009	1	7	1
> 08/10/2009	1	7	1
> 22/03/2010	1	7	0
> 22/03/2010	1	7	0
> 23/03/2010	1	7	0
> 23/03/2010	1	7	0
> 23/03/2010	1	7	0
> 12/04/2010	1	7	1
> 07/09/2010	1	7	1
> 07/09/2010	1	7	1
> 07/09/2010	1	7	1
> 07/09/2010	1	7	1
> 07/09/2010	1	7	1
> 10/09/2010	1	7	1
> 06/10/2010	1	7	1
> 27/05/2009	2	8	0
> 09/12/2009	2	8	0
> 05/02/2010	2	8	0
> 20/05/2010	2	8	0
> 23/08/2010	2	8	0
> 22/04/2008	3	10	0
> 18/12/2008	3	10	0
> 24/04/2009	3	10	0
> 24/04/2009	3	10	0
> 19/03/2010	3	10	0
> 19/03/2010	3	10	0
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/



