Re: st: Data Management Issue
Rebecca Pope <firstname.lastname@example.org>
Sat, 9 Mar 2013 18:54:07 -0600
Reducing the problem somewhat, let's say your data looks like this:
clear
tempfile event
// Set 1 (events; by definition this includes only observations for which
// there was a default)
input firmid industry yeardefault
1 4 2007
6 10 1998
end
rename firmid compid // to emphasize that you will need to rename this
// variable in your data
save `"`event'"', replace
// Set 2 (universe - all firms, all years, all industries)
clear
set obs 9
gen firmid = _n
gen industry = 4 in 1/3
replace industry = 10 in 4/6
replace industry = 18 in 7/9
expandcl 15, cluster(firmid industry) generate(newid)
bys firmid: gen year = _n+1994 // assuming your data runs from 1995 to 2009
merge m:1 industry using `"`event'"', nogen
// find +/- 2 years by industry; you're only interested in competitor behavior
gen insamp = firmid!=compid & ///
    inrange(year,yeardefault-2,yeardefault+2) & !missing(yeardefault)
// "simple" data (you still have records for all firms, but a pre/post
// indicator only for relevant observations)
gen post = (year > yeardefault) if insamp
*** end example ***
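On your second question (finding industries with no events): after a -merge m:1- on industry, firm-years from industries that never appear in the event file come through with a missing yeardefault, so they can be flagged directly. A minimal sketch on made-up toy data; the variable name noevent is hypothetical, and whether such an industry is "similar" enough to serve as a control is a separate matching question that this does not address:

```stata
* Sketch only: flag industries with no default as candidate controls.
* Toy data; noevent is a hypothetical variable name.
clear
tempfile event
input compid industry yeardefault
1 4 2007
6 10 1998
end
save `"`event'"', replace

* One firm per industry, 1995 to 2009
clear
input firmid industry
1 4
4 10
7 18
end
expand 15
bys firmid: gen year = _n + 1994

merge m:1 industry using `"`event'"', nogen
* Industries absent from the event file have no yeardefault after the merge
gen byte noevent = missing(yeardefault)
tab industry noevent
```

This relies on yeardefault never being missing in the event file itself.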
This is a _very_ general sketch of a solution. You should be able to
customize it to your specific data needs though. The biggest
complication I foresee is around multiple defaults in the same
industry, but without knowing the extent to which that happens, I
can't give you a good solution for that now. (Read: do not change the
code to -merge m:m-; the modifications will need to be a bit more than
that.)
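One possible direction for multiple defaults in the same industry, offered only as a sketch: -joinby- forms every pairing of a firm-year with each default in its industry, which is what -merge m:m- does not do. This assumes it is acceptable for a firm-year to appear once per default in its industry. The second default in industry 4 (firm 2, 2001) is invented for illustration, and -expand- stands in for -expandcl- to keep the toy data short:

```stata
* Sketch only: several defaults per industry handled with -joinby-.
* The second default in industry 4 (firm 2, 2001) is made up.
clear
tempfile event
input compid industry yeardefault
1 4 2007
2 4 2001
6 10 1998
end
save `"`event'"', replace

* Universe: 9 firms, 3 industries, 1995 to 2009
clear
set obs 9
gen firmid = _n
gen industry = 4 in 1/3
replace industry = 10 in 4/6
replace industry = 18 in 7/9
expand 15
bys firmid: gen year = _n + 1994

* Pair every firm-year with every default in its industry;
* unmatched(master) keeps firm-years from industries with no default
joinby industry using `"`event'"', unmatched(master)
drop _merge

gen insamp = firmid != compid & ///
    inrange(year, yeardefault-2, yeardefault+2) & !missing(yeardefault)
gen post = (year > yeardefault) if insamp
```

Because a firm-year is now repeated for each default in its industry, any later summary or regression would need to treat (compid, yeardefault) as the event identifier, and overlapping event windows would still need a rule of your choosing.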
On Sat, Mar 9, 2013 at 5:53 PM, thomas bourveau wrote:
> Dear Statalisters,
> I'm struggling a little bit with some data management issues.
> I want to test if firms change their behavior after one of their
> competitors defaulted. Specifically, I want to compare some risk
> taking measures in the two years before to the two years after this
> specific event occurred within an industry.
> I currently have two datasets:
> Dataset 1: The Event dataset
> In this dataset, I have around 1,000 observations of default events
> over 15 years for private companies. For each case, I have the
> identifier of the firm, an industry indicator and the year of
> occurrence of the event.
> Dataset 2: The Universe dataset
> In this dataset, I have computed my measures of risk taking and some
> control variables for the entire universe of firms. I also have a
> firm id, a year indicator and an industry indicator.
> 1. First, let's assume that I want to follow a simple difference
> approach (i.e., not building a counterfactual control sample). For each
> event, I need to "label" all observations in the industry as "Pre" in
> the years t and t-1 and "Post" in the years t+1 and t+2.
> My idea was to create a separate file for each of my 1,000 events.
> Then I would merge it with the dataset number 2 and keep only
> observations within the industry in the time period of interest.
> Finally, I would append all the 1,000 files together to obtain my
> final dataset.
> Does it seem right to you? Does anyone have an idea for a quicker /
> better way to do it?
> 2. My concern is that if now I want to build a control group based on
> an industry with similar characteristics, my approach will not allow
> me to find an industry with no events.
> Does anyone have an idea?
> Thanks in advance
> Thomas Bourveau
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/