Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

From |
Jurgen Sidgman <[email protected]> |

To |
[email protected] |

Subject |
st: Re: group classification |

Date |
Thu, 27 Jan 2011 16:59:34 -0600 |

> Eric, Thank you much for that code. However, your code has the same problem that mine has (although your is much much nicer). Looking at your companies and transactions, GM has three observations: nov 1, 2006 jun 11, 2007 dec 21, 2007 Your code looks at the data and categorizes the transaction in nov and dec with a 1. However, it fails to recognize that june although outside the nov-apr period still needs to be coded with a 1. If not, when I create the category of companies without transactions in the previous 6 months (beginning on April and searching backwards), I will place GM in both categories. The category without transactions (zero) should hold companies that only have transactions between may and october. A company coded with a 1 can have transactions at any time during a year if at least one transaction exist between november of the previous year and april of the current year. Of course the process is rebalanced every year. For example: At the end of april of 2007 GM will be coded 1 because of the transaction it has on nov. 2006 At the end of april of 2008 GM will be coded 1 again because of the transaction it has on dec. 2007 As long as these two transactions exist, other transactions within these periods are irrelevant for classification purposes. On the other hand, let's say that GM executes 6 transactions in 2005. One for each month between may and october. At the end of april of 2006 GM will be coded 0, assuming it has no transactions between nov. 01, 2005 and apr. 30, 2006. Sorry if I did not explain well myself before. Thanks a lot! Jurgen > - -----Original Message----- > > From: Jurgen Sidgman <[email protected]> > Subj: st: group classification > Date: Wed Jan 26, 2011 4:33 pm > Size: 1K > To: [email protected] > > Hi all, > > I have what I thought it was a simple problem but has turned out to be a head age. I have a dataset of company names and transaction dates. What I need to do with it is as follows: > For each company, at the end of every april of each year, I want to verify that in the prior six months the company has at least one transaction. If the company has a transaction between november of the prior year and April of the current year I code it with a 1. If the company does not have a transaction within that period, regardless of having transactions in other months (lets say september) I code it with a 0. > > I keep getting false positives because I have companies that have transactions in lets' say november and december of last year, no transactions on this year, and again transactions in january of next year. Then, what i get is that i am able to set to one last year and the next year, but I get a zero on the current year, which is not appropriate. > > My code follows > . use "dataset.dta", clear > . drop if month(transdate) >= 5 & month(transdate) <= 10 > . drop if month(transdate) < 11 & year(transdate) == 2004 > . drop if month(transdate) > 4 & year(transdate) == 2008 > . save "cross_section.dta", replace > . gen transyear = year(transdate) > . Keep compname transyear > . duplicates drop > . gen prior6 = 1 > . snapshot save, label("Snapshot 1") > > . use "dataset.dta", clear > . drop if month(transdate) < 11 & year(transdate) == 2004 > . drop if month(transdate) > 4 & year(transdate) == 2008 > . gen transyear = year(transdate) > . save "cross_section.dta", replace > > . merge 1:m cusip form4 transyear using "cross_section.dta" > > Obviously merging using the year in the key does not work, but I am out of ideas for the moment. Can anyone help? > > Thank you! > * > * For searches and help try: > * http://www.stata.com/help.cgi?search > * http://www.stata.com/support/statalist/faq > * http://www.ats.ucla.edu/stat/stata/ > > > - --- message truncated --- > > > * > * For searches and help try: > * http://www.stata.com/help.cgi?search > * http://www.stata.com/support/statalist/faq > * http://www.ats.ucla.edu/stat/stata/ > > ------------------------------ > > Date: Thu, 27 Jan 2011 01:14:38 +0000 > From: Eric Booth <[email protected]> > Subject: Re: st: group classification > > <> > > Here's a non-merge based approach: > > ***************************! Begin Example > clear > inp str14(compname transdate) > "General Motors" "11/1/2006" > "Honda" "2/10/2006" > "Nissan" "10/10/2006" > "AAA" "11/05/2006" > "AAA" "12/07/2006" > "Honda" "1/08/2006" > "General Motors" "6/11/2007" > "AAA" "11/15/2007" > "General Motors" "12/21/2007" > "AAA" "2/21/2007" > "Nissan" "4/21/2007" > end > > **setup** > ta compname > **fix dates** > g transdate2 = date(transdate, "MDY") > format transdate2 %td > ta transdate > > /* OP: For each company, at the end > of every april of each year, I want > to verify that in the prior six months > the company has at least one transaction. > If the company has a transaction between > november of the prior year and April of > the current year I code it with a 1. If > the company does not have a transaction > within that period, regardless of having > transactions in other months I code it with a 0. > */ > > forval y = 2005/2008 { > **how many days to look for? > loc days = date("04/30/`y'", "MDY") - date("11/01/`=`y'-1'", "MDY") > di "`days'" > loc auditdate = date("04/30/`y'", "MDY") > di %td `auditdate' > g prior6_`y' = . > replace prior6_`y' = 1 if /// > (`auditdate'-transdate2)<=`days' & /// > (`auditdate'-transdate2)>=0 > recode prior6_`y' (.=0) if !mi(transdate2) > /* uncomment/modify line below to recode to missing > if notin 2 year period around auditdate */ > **recode prior6_`y' (0=.) if !inlist(year(transdate2), `y', `=`y'-1') > **to get counts by compname for each auditdate, you could: > bys compname: egen prior6_`y'_sum = sum(prior6_`y') > } > l compname transdate prior6_2007*, sepby(compname) noobs > ***************************! End Example > > - - Eric > > __ > Eric A. Booth > Public Policy Research Institute > Texas A&M University > [email protected] > Office: +979.845.6754 * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:**Re: st: Re: group classification***From:*Eric Booth <[email protected]>

- Prev by Date:
**Antwort: st: Heckprob error: outcome = y_select > 0 predicts data perfectly** - Next by Date:
**st: Random sorting** - Previous by thread:
**st: Heckprob error: outcome = y_select > 0 predicts data perfectly** - Next by thread:
**Re: st: Re: group classification** - Index(es):