
# re: st: Econometrics question

From: Christopher Baum   |   To: statalist@hsphsun2.harvard.edu   |   Subject: re: st: Econometrics question   |   Date: Mon, 29 Mar 2010 16:44:07 -0400

```<>

I have a basic econometric question and I'm hoping you can help me out. I am running a regression of bond spreads on various variables denoting domestic economic conditions, and country fixed effects; I'm clustering my standard errors by quarter, e.g.

xi: regress LogSpread GDPgrowth DebtToGDP i.country, cluster(time)

I have quarterly data for 40 different countries, although it's a very unbalanced panel because the spread of the bond is for new bond issues and a lot of countries don't issue new bonds every quarter. So, the data would look something like this:

Argentina 1991q1 400    3.0
Argentina 1994q4 450    2.5
Argentina 2001q3 800    0.7
Brazil    1993q2 ...
Brazil    1993q4 ...
Brazil    1994q1 ...
Colombia ...
...

When I run a simple regression like the one above for the full sample, I obtain a coefficient for GDPgrowth of -0.073***

Then if I run this same regression for two separate subsamples for the years 1991-1997 and 1998-2006, my coefficients for GDPgrowth are -0.056 and 0.009, both insignificant.

In my experience, the full sample coefficient would in general be some sort of weighted average of the two coefficients obtained from subsample regressions. So, I don't understand why this is not the case here...

The number of observations in the two subsamples adds up to the number of observations in the full-sample estimation.

Other posters suggested that this might be due to the presence of additional explanatory variables in your model, noting that your intuition should hold in the context of a univariate (y on x) regression. But in your case you are essentially running a fixed-effects panel regression, with country fixed effects. The correlation between variations in GDP growth around its country-specific mean and variations in LogSpread around its country-specific mean may well have shifted over time. In a very unbalanced panel, I would guess that the number of observations per country might be very small, so the demeaning (within) transformation might be introducing quite a bit of noise when applied to the shorter sample. Why don't you try running

xtreg LogSpread GDPgrowth DebtToGDP, fe cluster(time)

for the full sample and for the two subsamples. You should get the same coefficients as from the regressions with country dummies, but the output will also report the number of countries and the observations per country, which tells you how much data is used to compute the within estimates.

You also might want to consider adding time effects, which I expect are likely to be highly significant in these data. That is, GDP growth rates may not have changed that much, but when there were various financial crises, the spreads are likely to have changed by a lot across all countries. Time fixed effects would pick those up.

Kit Baum   |   Boston College Economics and DIW Berlin   |   http://ideas.repec.org/e/pba1.html
An Introduction to Stata Programming   |   http://www.stata-press.com/books/isp.html
An Introduction to Modern Econometrics Using Stata   |   http://www.stata-press.com/books/imeus.html
```
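The suggestions in the reply can be put together as a short do-file. This is only a sketch under stated assumptions, not the poster's actual setup: it assumes `country` is stored as a string, that `time` is a Stata quarterly date, and that Stata 11 or later is in use (for the `i.time` factor-variable notation). The names `cid` and `yr` are hypothetical helpers introduced here. As far as I know, recent versions of `xtreg, fe` refuse to cluster on a variable within which the panels are not nested unless `nonest` is given, so that option is added; `vce(cluster time)` is the newer spelling of `cluster(time)`.

```stata
* Sketch only: cid and yr are hypothetical helper variables.
* Assumes country is a string and time is a quarterly date (%tq).
encode country, generate(cid)
generate yr = yofd(dofq(time))

* Declare the panel variable only; xtreg, fe does not need a time variable.
xtset cid

* Full sample, then the two subsamples from the question.
* nonest: countries are not nested within quarters.
xtreg LogSpread GDPgrowth DebtToGDP, fe vce(cluster time) nonest
xtreg LogSpread GDPgrowth DebtToGDP if inrange(yr, 1991, 1997), fe vce(cluster time) nonest
xtreg LogSpread GDPgrowth DebtToGDP if inrange(yr, 1998, 2006), fe vce(cluster time) nonest

* Same model with quarter (time) fixed effects added.
xtreg LogSpread GDPgrowth DebtToGDP i.time, fe vce(cluster time) nonest
```

In each set of results, the "Number of groups" and "Obs per group" lines in the header show how many countries, and how many observations per country, contribute to the within estimates. If many countries have only one or two observations in a subsample, the demeaned data carry little information, which is the concern raised in the reply. The standard errors may differ slightly from those of `regress` with country dummies because of degrees-of-freedom adjustments, but the coefficients should match.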