Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: Re: group classification


From   Eric Booth <[email protected]>
To   "<[email protected]>" <[email protected]>
Subject   Re: st: Re: group classification
Date   Fri, 28 Jan 2011 19:45:24 +0000

<>


If I understand your post, you should be able to use -collapse- to get the 1 for any transaction in may 2006 to april 2007 if there was at least 1 transaction in the Nov 2006 to Apr 2007  target period for that company, so:
	collapse (max) prior6_*, by(compname year)
I've modified the code below to reflect the clarification in your latest post...the new code is marked with *comments*:

***************************! Begin Example
clear
inp str14(compname transdate)
"General Motors" "11/1/2006"
"Honda" "2/10/2006"
"Nissan" "10/10/2006"
"AAA" "11/05/2006"
"AAA" "12/07/2006"
"Honda" "1/08/2006"
"General Motors" "6/11/2007"
"AAA" "11/15/2007"
"General Motors" "12/21/2007"
"AAA" "2/21/2007"
"Nissan" "4/21/2007"
end

**setup**
ta compname
**fix dates**
g transdate2 = date(transdate, "MDY")
format transdate2 %td
ta transdate


**new**
g year = year(transdate2)
g month = month(transdate2)
drop transdate

forval y = 2005/2008 {
**how many days to look for?
loc days = date("04/30/`y'", "MDY") - date("11/01/`=`y'-1'", "MDY")
di "`days'"
loc auditdate = date("04/30/`y'", "MDY")
di %td `auditdate'
	g prior6_`y' = . 
	replace prior6_`y' = 1 if  ///
		(`auditdate'-transdate2)<=`days' & ///
		(`auditdate'-transdate2)>=0
	recode prior6_`y' (.=0) if !mi(transdate2) 
**bys compname: egen prior6_`y'_sum = sum(prior6_`y') 


**new**
**replace all values from that year with 1 if any transdate is within 6 mos:
bys compname year: egen m_`y' = max(prior6_`y') 
replace prior6_`y' = m_`y' if year==`y' & month<=4
replace prior6_`y' = m_`y' if year==`=`y'-1' & inrange(month, 5, 12)
drop m_`y'
		}
l compname transdate prior6_2007*, sepby(compname) noobs


**based on your description, I think this is where you're going-->
collapse (max) prior6_*, by(compname year)
l
***************************! End Example

- Eric

__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
[email protected]
Office: +979.845.6754

On Jan 27, 2011, at 4:59 PM, Jurgen Sidgman wrote:

>> Eric,
> 
> Thank you much for that code. However, your code has the same problem that mine has (although your is much much nicer).
> Looking at your companies and transactions, GM has three observations:
> nov 1, 2006
> jun 11, 2007
> dec 21, 2007
> 
> Your code looks at the data and categorizes the transaction in nov and dec with a 1. However, it fails to recognize that june although outside the nov-apr period still needs to be coded with a 1. If not, when I create the category of companies without transactions in the previous 6 months (beginning on April and searching backwards), I will place GM in both categories. The category without transactions (zero) should hold companies that only have transactions between may and october. A company coded with a 1 can have transactions at any time during a year if at least one transaction exist between november of the previous year and april of the current year. Of course the process is rebalanced every year.
> 
> For example:
> At the end of april of 2007 GM will be coded 1 because of the transaction it has on nov. 2006
> At the end of april of 2008 GM will be coded 1 again because of the transaction it has on dec. 2007
> 
> As long as these two transactions exist, other transactions within these periods are irrelevant for classification purposes.
> 
> On the other hand, let's say that GM executes 6 transactions in 2005. One for each month between may and october.
> At the end of april of 2006 GM will be coded 0, assuming it has no transactions between nov. 01, 2005 and apr. 30, 2006.
> 
> Sorry if I did not explain well myself before.
> 
> Thanks a lot!
> 
> Jurgen
> 
> 
>> - -----Original Message-----
>> 
>> From:  Jurgen Sidgman <[email protected]>
>> Subj:  st: group classification
>> Date:  Wed Jan 26, 2011 4:33 pm
>> Size:  1K
>> To:  [email protected]
>> 
>> Hi all,
>> 
>> I have what I thought it was a simple problem but has turned out to be a head age. I have a dataset of company names and transaction dates. What I need to do with it is as follows:
>> For each company, at the end of every april of each year, I want to verify that in the prior six months the company has at least one transaction. If the company has a transaction between november of the prior year and April of the current year I code it with a 1. If the company does not have a transaction within that period, regardless of having transactions in other months (lets say september) I code it with a 0.
>> 
>> I keep getting false positives because I have companies that have transactions in lets' say november and december of last year, no transactions on this year, and again transactions in january of next year. Then, what i get is that i am able to set to one last year and the next year, but I get a zero on the current year, which is not appropriate.
>> 
>> My code follows
>> . use "dataset.dta", clear
>> . drop if month(transdate) >= 5 & month(transdate) <= 10
>> . drop if month(transdate) < 11 & year(transdate) == 2004
>> . drop if month(transdate) > 4 & year(transdate) == 2008
>> . save "cross_section.dta", replace
>> . gen transyear = year(transdate)
>> . Keep compname transyear
>> . duplicates drop
>> . gen prior6 = 1
>> . snapshot save, label("Snapshot 1")
>> 
>> . use "dataset.dta", clear
>> . drop if month(transdate) < 11 & year(transdate) == 2004
>> . drop if month(transdate) > 4 & year(transdate) == 2008
>> . gen transyear = year(transdate)
>> . save "cross_section.dta", replace
>> 
>> . merge 1:m cusip form4 transyear using "cross_section.dta"
>> 
>> Obviously merging using the year in the key does not work, but I am out of ideas for the moment. Can anyone help?
>> 
>> Thank you!
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>> 
>> 
>> - --- message truncated ---
>> 
>> 
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>> 
>> ------------------------------
>> 
>> Date: Thu, 27 Jan 2011 01:14:38 +0000
>> From: Eric Booth <[email protected]>
>> Subject: Re: st: group classification
>> 
>> <>
>> 
>> Here's a non-merge based approach:
>> 
>> ***************************! Begin Example
>> clear
>> inp str14(compname transdate)
>> "General Motors" "11/1/2006"
>> "Honda" "2/10/2006"
>> "Nissan" "10/10/2006"
>> "AAA" "11/05/2006"
>> "AAA" "12/07/2006"
>> "Honda" "1/08/2006"
>> "General Motors" "6/11/2007"
>> "AAA" "11/15/2007"
>> "General Motors" "12/21/2007"
>> "AAA" "2/21/2007"
>> "Nissan" "4/21/2007"
>> end
>> 
>> **setup**
>> ta compname
>> **fix dates**
>> g transdate2 = date(transdate, "MDY")
>> format transdate2 %td
>> ta transdate
>> 
>> /* OP: For each company, at the end 
>> of every april of each year, I want 
>> to verify that in the prior six months
>> the company has at least one transaction.
>> If the company has a transaction between
>> november of the prior year and April of 
>> the current year I code it with a 1. If
>> the company does not have a transaction 
>> within that period, regardless of having
>> transactions in other months I code it with a 0.
>> */
>> 
>> forval y = 2005/2008 {
>> **how many days to look for?
>> loc days = date("04/30/`y'", "MDY") - date("11/01/`=`y'-1'", "MDY")
>> di "`days'"
>> loc auditdate = date("04/30/`y'", "MDY")
>> di %td `auditdate'
>> 	g prior6_`y' = . 
>> 	replace prior6_`y' = 1 if  ///
>> 		(`auditdate'-transdate2)<=`days' & ///
>> 		(`auditdate'-transdate2)>=0
>> 	recode prior6_`y' (.=0) if !mi(transdate2) 
>> /* uncomment/modify line below to recode to missing 
>> 	if notin 2 year period around auditdate  */
>> **recode prior6_`y' (0=.) if !inlist(year(transdate2), `y', `=`y'-1')
>> **to get counts by compname for each auditdate, you could:
>> bys compname: egen prior6_`y'_sum = sum(prior6_`y')
>> 		}
>> l compname transdate prior6_2007*, sepby(compname) noobs
>> ***************************! End Example
>> 
>> - - Eric
>> 
>> __
>> Eric A. Booth
>> Public Policy Research Institute
>> Texas A&M University
>> [email protected]
>> Office: +979.845.6754
> 


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index