Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: st: Re: group classification
From
Eric Booth <[email protected]>
To
"<[email protected]>" <[email protected]>
Subject
Re: st: Re: group classification
Date
Fri, 28 Jan 2011 19:45:24 +0000
<>
If I understand your post, you should be able to use -collapse- to get the 1 for any transaction in may 2006 to april 2007 if there was at least 1 transaction in the Nov 2006 to Apr 2007 target period for that company, so:
collapse (max) prior6_*, by(compname year)
I've modified the code below to reflect the clarification in your latest post...the new code is marked with *comments*:
***************************! Begin Example
clear
inp str14(compname transdate)
"General Motors" "11/1/2006"
"Honda" "2/10/2006"
"Nissan" "10/10/2006"
"AAA" "11/05/2006"
"AAA" "12/07/2006"
"Honda" "1/08/2006"
"General Motors" "6/11/2007"
"AAA" "11/15/2007"
"General Motors" "12/21/2007"
"AAA" "2/21/2007"
"Nissan" "4/21/2007"
end
**setup**
ta compname
**fix dates**
g transdate2 = date(transdate, "MDY")
format transdate2 %td
ta transdate
**new**
g year = year(transdate2)
g month = month(transdate2)
drop transdate
forval y = 2005/2008 {
**how many days to look for?
loc days = date("04/30/`y'", "MDY") - date("11/01/`=`y'-1'", "MDY")
di "`days'"
loc auditdate = date("04/30/`y'", "MDY")
di %td `auditdate'
g prior6_`y' = .
replace prior6_`y' = 1 if ///
(`auditdate'-transdate2)<=`days' & ///
(`auditdate'-transdate2)>=0
recode prior6_`y' (.=0) if !mi(transdate2)
**bys compname: egen prior6_`y'_sum = sum(prior6_`y')
**new**
**replace all values from that year with 1 if any transdate is within 6 mos:
bys compname year: egen m_`y' = max(prior6_`y')
replace prior6_`y' = m_`y' if year==`y' & month<=4
replace prior6_`y' = m_`y' if year==`=`y'-1' & inrange(month, 5, 12)
drop m_`y'
}
l compname transdate prior6_2007*, sepby(compname) noobs
**based on your description, I think this is where you're going-->
collapse (max) prior6_*, by(compname year)
l
***************************! End Example
- Eric
__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
[email protected]
Office: +979.845.6754
On Jan 27, 2011, at 4:59 PM, Jurgen Sidgman wrote:
>> Eric,
>
> Thank you much for that code. However, your code has the same problem that mine has (although your is much much nicer).
> Looking at your companies and transactions, GM has three observations:
> nov 1, 2006
> jun 11, 2007
> dec 21, 2007
>
> Your code looks at the data and categorizes the transaction in nov and dec with a 1. However, it fails to recognize that june although outside the nov-apr period still needs to be coded with a 1. If not, when I create the category of companies without transactions in the previous 6 months (beginning on April and searching backwards), I will place GM in both categories. The category without transactions (zero) should hold companies that only have transactions between may and october. A company coded with a 1 can have transactions at any time during a year if at least one transaction exist between november of the previous year and april of the current year. Of course the process is rebalanced every year.
>
> For example:
> At the end of april of 2007 GM will be coded 1 because of the transaction it has on nov. 2006
> At the end of april of 2008 GM will be coded 1 again because of the transaction it has on dec. 2007
>
> As long as these two transactions exist, other transactions within these periods are irrelevant for classification purposes.
>
> On the other hand, let's say that GM executes 6 transactions in 2005. One for each month between may and october.
> At the end of april of 2006 GM will be coded 0, assuming it has no transactions between nov. 01, 2005 and apr. 30, 2006.
>
> Sorry if I did not explain well myself before.
>
> Thanks a lot!
>
> Jurgen
>
>
>> - -----Original Message-----
>>
>> From: Jurgen Sidgman <[email protected]>
>> Subj: st: group classification
>> Date: Wed Jan 26, 2011 4:33 pm
>> Size: 1K
>> To: [email protected]
>>
>> Hi all,
>>
>> I have what I thought it was a simple problem but has turned out to be a head age. I have a dataset of company names and transaction dates. What I need to do with it is as follows:
>> For each company, at the end of every april of each year, I want to verify that in the prior six months the company has at least one transaction. If the company has a transaction between november of the prior year and April of the current year I code it with a 1. If the company does not have a transaction within that period, regardless of having transactions in other months (lets say september) I code it with a 0.
>>
>> I keep getting false positives because I have companies that have transactions in lets' say november and december of last year, no transactions on this year, and again transactions in january of next year. Then, what i get is that i am able to set to one last year and the next year, but I get a zero on the current year, which is not appropriate.
>>
>> My code follows
>> . use "dataset.dta", clear
>> . drop if month(transdate) >= 5 & month(transdate) <= 10
>> . drop if month(transdate) < 11 & year(transdate) == 2004
>> . drop if month(transdate) > 4 & year(transdate) == 2008
>> . save "cross_section.dta", replace
>> . gen transyear = year(transdate)
>> . Keep compname transyear
>> . duplicates drop
>> . gen prior6 = 1
>> . snapshot save, label("Snapshot 1")
>>
>> . use "dataset.dta", clear
>> . drop if month(transdate) < 11 & year(transdate) == 2004
>> . drop if month(transdate) > 4 & year(transdate) == 2008
>> . gen transyear = year(transdate)
>> . save "cross_section.dta", replace
>>
>> . merge 1:m cusip form4 transyear using "cross_section.dta"
>>
>> Obviously merging using the year in the key does not work, but I am out of ideas for the moment. Can anyone help?
>>
>> Thank you!
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
>>
>>
>> - --- message truncated ---
>>
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
>>
>> ------------------------------
>>
>> Date: Thu, 27 Jan 2011 01:14:38 +0000
>> From: Eric Booth <[email protected]>
>> Subject: Re: st: group classification
>>
>> <>
>>
>> Here's a non-merge based approach:
>>
>> ***************************! Begin Example
>> clear
>> inp str14(compname transdate)
>> "General Motors" "11/1/2006"
>> "Honda" "2/10/2006"
>> "Nissan" "10/10/2006"
>> "AAA" "11/05/2006"
>> "AAA" "12/07/2006"
>> "Honda" "1/08/2006"
>> "General Motors" "6/11/2007"
>> "AAA" "11/15/2007"
>> "General Motors" "12/21/2007"
>> "AAA" "2/21/2007"
>> "Nissan" "4/21/2007"
>> end
>>
>> **setup**
>> ta compname
>> **fix dates**
>> g transdate2 = date(transdate, "MDY")
>> format transdate2 %td
>> ta transdate
>>
>> /* OP: For each company, at the end
>> of every april of each year, I want
>> to verify that in the prior six months
>> the company has at least one transaction.
>> If the company has a transaction between
>> november of the prior year and April of
>> the current year I code it with a 1. If
>> the company does not have a transaction
>> within that period, regardless of having
>> transactions in other months I code it with a 0.
>> */
>>
>> forval y = 2005/2008 {
>> **how many days to look for?
>> loc days = date("04/30/`y'", "MDY") - date("11/01/`=`y'-1'", "MDY")
>> di "`days'"
>> loc auditdate = date("04/30/`y'", "MDY")
>> di %td `auditdate'
>> g prior6_`y' = .
>> replace prior6_`y' = 1 if ///
>> (`auditdate'-transdate2)<=`days' & ///
>> (`auditdate'-transdate2)>=0
>> recode prior6_`y' (.=0) if !mi(transdate2)
>> /* uncomment/modify line below to recode to missing
>> if notin 2 year period around auditdate */
>> **recode prior6_`y' (0=.) if !inlist(year(transdate2), `y', `=`y'-1')
>> **to get counts by compname for each auditdate, you could:
>> bys compname: egen prior6_`y'_sum = sum(prior6_`y')
>> }
>> l compname transdate prior6_2007*, sepby(compname) noobs
>> ***************************! End Example
>>
>> - - Eric
>>
>> __
>> Eric A. Booth
>> Public Policy Research Institute
>> Texas A&M University
>> [email protected]
>> Office: +979.845.6754
>
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/