Re: st: Data Management Issue

From   Rebecca Pope
Subject   Re: st: Data Management Issue
Date   Sat, 9 Mar 2013 18:54:07 -0600

Reducing the problem somewhat, let's say your data looks like this:

// Set 1 (events; by definition only includes observations for which
// there was a default)
version 12
clear
input firmid industry yeardefault
1 4 2007
6 10 1998
end
// rename to emphasize that you will need to rename this variable in your data
rename firmid compid
tempfile event
save `"`event'"', replace

// Set 2 (universe - all firms, all years, all industries)
clear
set obs 9
gen firmid = _n
gen industry = 4 in 1/3
replace industry = 10 in 4/6
replace industry = 18 in 7/9
expandcl 15, cluster(firmid industry) generate(newid)

bys firmid: gen year = _n+1994  // assuming your data runs from 1995 to 2009
merge m:1 industry using `"`event'"', nogen

// find +/- 2 years by industry; you're only interested in competitor behavior
gen insamp = firmid!=compid & ///
    inrange(year,yeardefault-2,yeardefault+2) & !missing(yeardefault)

// "simple" data (you still have records for all firms, a pre/post
// indicator only for relevant observations)
gen post = (year > yeardefault) if insamp

 *** end example ***
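
Because the merge keeps every universe record, industries that never had a default remain in the data with a missing yeardefault, so they can be flagged as candidates for a control group. A minimal sketch, assuming the variable names from the example; everdefault and noevent_ind are hypothetical names, and the few input rows are made up purely for illustration:

```stata
version 12
clear
// Illustrative post-merge data: industry 18 has no default on record
input firmid industry year yeardefault
1  4 2006 2007
2  4 2008 2007
4 10 1998 1998
7 18 2006 .
8 18 2007 .
end

// 1 if any record in the industry carries a default year, 0 otherwise
egen byte everdefault = max(!missing(yeardefault)), by(industry)

// hypothetical flag: industries with no events at all
gen byte noevent_ind = !everdefault

tabulate industry noevent_ind
```

Matching such industries to event industries on "similar characteristics" is a separate step that this sketch does not attempt.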

This is a _very_ general sketch of a solution. You should be able to
customize it to your specific data needs though. The biggest
complication I foresee is around multiple defaults in the same
industry, but without knowing the extent to which that happens, I
can't give you a good solution for that now. (Read: do not change the
code to -merge m:m-; the modifications will need to be a bit more than
that.)

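For multiple defaults in the same industry, one possible route is -joinby-, which forms every pairing of universe records and events within an industry. This is an assumption about what would suit the data, not a tested solution; the extra default in industry 4 (firm 2, 2001) is invented for illustration, and -expand- is used here in place of -expandcl- only to keep the setup short:

```stata
version 12
clear
// Event file with two defaults in industry 4 (illustrative values)
input compid industry yeardefault
1 4 2007
2 4 2001
6 10 1998
end
tempfile event
save `"`event'"', replace

// Universe: 9 firms, 3 industries, 1995 to 2009
clear
set obs 9
gen firmid = _n
gen industry = 4 in 1/3
replace industry = 10 in 4/6
replace industry = 18 in 7/9
expand 15
bysort firmid: gen year = _n + 1994

// Pair every firm-year with every default in its industry;
// unmatched(master) keeps industries that have no events
joinby industry using `"`event'"', unmatched(master)
drop _merge

// Same window and pre/post logic as before, now evaluated per event
gen insamp = firmid != compid & !missing(yeardefault) & ///
    inrange(year, yeardefault-2, yeardefault+2)
gen post = (year > yeardefault) if insamp
```

The result has one record per firm-year per default, so a firm-year that falls inside two event windows appears twice. Whether to keep both records or resolve the overlap depends on the analysis.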

On Sat, Mar 9, 2013 at 5:53 PM, thomas bourveau wrote:
> Dear Statalisters,
> I'm struggling a little bit with some data management issues.
> I want to test if firms change their behavior after one of their
> competitors defaulted. Specifically, I want to compare some risk
> taking measures in the two years before to the two years after this
> specific event occurred within an industry.
> I currently have two datasets:
> Dataset 1: The Event dataset
> In this dataset, I have around 1,000 observations of default events
> over 15 years for private companies. For each case, I have the
> identifier of the firm, an industry indicator and the year of
> occurrence of the event.
> Dataset 2: The Universe dataset
> In this dataset, I have computed my measures of risk taking and some
> control variables for the entire universe of firms. I also have a
> firm id, a year indicator and an industry indicator.
> 1. First, let's assume that I want to follow a simple difference
> approach (i.e., not building a counterfactual control sample). For each
> event, I need to "label" all observations in the industry as "Pre" in
> the years t and t-1 and "Post" in the years t+1 and t+2.
> My idea was to create a separate file for each of my 1,000 events.
> Then I would merge it with the dataset number 2 and keep only
> observations within the industry in the time period of interest.
> Finally, I would append all the 1,000 files together to obtain my
> final dataset.
> Does this seem right to you? Does anyone have an idea for a quicker or
> better way to do it?
> 2. My concern is that if I now want to build a control group based on
> an industry with similar characteristics, my approach will not allow
> me to find an industry with no events.
> Does anyone have an idea?
> Thanks in advance
> Best,
> Thomas
> --
> Thomas Bourveau