RE: st: Econometrics question

From	kokootchke <[email protected]>
To	statalist <[email protected]>
Subject	RE: st: Econometrics question
Date	Mon, 29 Mar 2010 17:47:09 -0400


> Adrian said
>
>
> I have a basic econometric question and I'm hoping you can help me out. I am running a regression of bond spreads on various variables denoting domestic economic conditions, and country fixed effects; I'm clustering my standard errors by quarter, e.g.
>
> xi: regress LogSpread GDPgrowth DebtToGDP i.country, cluster(time)
>
> I have quarterly data for 40 different countries, although it's a very unbalanced panel because the spread of the bond is for new bond issues and a lot of countries don't issue new bonds every quarter. So, the data would look something like this:
>
> Country Time Spread GDPgrowth
> Argentina 1991q1 400 3.0
> Argentina 1994q4 450 2.5
> Argentina 2001q3 800 0.7
> Brazil 1993q2 ...
> Brazil 1993q4 ...
> Brazil 1994q1 ...
> Colombia ...
> ...
>
> When I run a simple regression like the one above for the full sample, I obtain a coefficient for GDPgrowth of -0.073***
>
> Then if I run this same regression for two separate subsamples for the years 1991-1997 and 1998-2006, my coefficients for GDPgrowth are -0.056 and 0.009, both insignificant.
>
> In my experience, the full sample coefficient would in general be some sort of weighted average of the two coefficients obtained from subsample regressions. So, I don't understand why this is not the case here...
>
> The number of observations in the two subsamples adds up to the number of observations in the full-sample estimation.
>


Kit Baum said:

> Other posters suggested that this might be due to the presence of additional explanatory variables in your model, noting that your intuition should hold in the context of a univariate (y on x) regression. But in your case you are essentially running a fixed-effects panel regression, with country fixed effects. The correlation between variations in GDP growth around its country-specific mean and variations in LogSpread around its country-specific mean may well have shifted over time. In a very unbalanced panel, I would guess that the number of observations per country might be very small, so the demeaning (within) transformation might be introducing quite a bit of noise when applied to the shorter sample. 
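
One rough way to look at the point about the within transformation is sketched below. It uses the variable names from the regressions further down (coucode, year, L1ggdp); the names post, m_full, m_sub, dm_full and dm_sub are made up for illustration. It compares GDP growth demeaned by country over the whole sample with GDP growth demeaned by country separately before and after 1998. It ignores that xtreg demeans over the estimation sample only, so it is a check on intuition and not a replication of the estimator.

```stata
* Hypothetical helper names: post, m_full, m_sub, dm_full, dm_sub
capture drop post
generate byte post = year >= 1998 if !missing(year)

* Country mean over the full sample vs. within each sub-period
bysort coucode: egen m_full = mean(L1ggdp)
bysort coucode post: egen m_sub = mean(L1ggdp)

* Deviations from each mean: the variation the FE slope is based on
generate dm_full = L1ggdp - m_full
generate dm_sub  = L1ggdp - m_sub

* Compare the two sets of deviations
summarize dm_full dm_sub
correlate dm_full dm_sub
```

If the two sets of deviations differ noticeably, the full-sample FE estimate is also using shifts in a country's average between the two periods, which neither subsample regression uses. In that case it need not fall between the two subsample coefficients.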



The number of observations is indeed small for some countries, but not for all of them. For example, the number of observations before/after 1998 is 21/14 for Argentina, 21/25 for Brazil, and 18/14 for China; these are some of the larger bond issuers. For countries that don't issue bonds very often, the pre/post-1998 counts look more like: Bulgaria 1/1, Jamaica 2/9, Poland 5/5, Pakistan 3/3.
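
Counts like these can be produced with something along the following lines. This is only a sketch: post is a made-up indicator name, and the second tabulate assumes it is run right after the full-sample regression so that e(sample) marks the observations actually used.

```stata
* post is a hypothetical indicator for the 1998-2006 sub-period
capture drop post
generate byte post = year >= 1998 if !missing(year)

* Observations per country, before and after 1998 (all rows in the data)
tabulate coucode post

* Same table, restricted to the observations used in the last estimation
tabulate coucode post if e(sample)
```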



> Why don't you try running
>
> xtreg LogSpread GDPgrowth DebtToGDP, fe cluster(time)
>
> for the full sample and for the two subsamples. You should get the same coefficients, but will also get some information about the amount of data used to compute the within estimates.


I include some results below. They are WITHOUT the cluster(time) option because Stata told me that "panels are not nested within clusters" -- I'm not sure what this means. If I understand these outputs correctly (and if I'm looking at the right things), I guess what you are alluding to is that the full sample uses a different number of countries and a different number of observations per group than each subsample. Is this right? For example, the full sample uses 39 countries vs. 34 and 36 for pre- and post-1998, respectively. The average number of observations per group is 14.8 in the full sample vs. 6.3 and 10.1 in the subsamples. Would these differences be the culprit?
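
On the error message: as far as I can tell, xtreg, fe checks that every panel (here, a country) falls entirely inside one cluster. A country observed in many quarters spans many time clusters, so clustering on time fails that check. A possible workaround, sketched below, is areg, which absorbs the country dummies and should reproduce the original "xi: regress ... i.country, cluster(time)" setup. The variable name time is taken from the first post and is assumed to exist in the data alongside coucode.

```stata
* Country effects absorbed instead of estimated as explicit dummies;
* standard errors clustered on the quarterly variable "time" (assumed name).
* The /// line continuation works in a do-file, not at the command prompt.
areg lwtspread L1ggdp L1edtgdp L1tdsxgs L1resimp L1current lamount wtmat ///
    lus1 lswap embivol icrg_othersavg, absorb(coucode) vce(cluster time)
```

The coefficients should match the xtreg, fe results; only the standard errors change with the choice of cluster variable.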



>
> You also might want to consider adding time effects, which I expect are likely to be highly significant in these data. That is, GDP growth rates may not have changed that much, but when there were various financial crises, the spreads are likely to have changed by a lot across all countries. Time fixed effects would pick those up.



My results look at the regular OLS estimates and then the country FE and the year FE estimates, to try to understand how these effects differ in the cross-section and over time. In principle, I also think I should have country and year effects simultaneously. In practice, however, this may not be a good thing, because some countries issue only 1 or 2 bonds per year or none at all, which would leave me with very little variation to exploit. Nonetheless, when I include both country and year FE, my results look very similar to the results with only country FE.
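
For reference, a specification with both country and year effects can be written roughly as follows. This is a sketch using the variable list from the regressions below, not the exact command behind the results I describe; xi: creates the year indicators with the _Iyear_ prefix.

```stata
* Country fixed effects via -fe-, year effects via xi-generated dummies.
* The /// line continuation works in a do-file, not at the command prompt.
xi: xtreg lwtspread L1ggdp L1edtgdp L1tdsxgs L1resimp L1current lamount wtmat ///
    lus1 lswap embivol icrg_othersavg i.year, fe r

* Joint Wald test that all year effects are zero
testparm _Iyear_*
```

Since the data are quarterly, year dummies do not fully absorb the global controls (lus1, lswap, embivol), although they will take away much of their variation.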

Also, note that in the results below I'm also controlling for "global" economic conditions (U.S. interest rates, 10-year swap spreads, volatility of EMBI spreads...). These are not individual crisis episodes, but they should pick up some of the simultaneous changes in spreads you suggest.

These are the results you suggested I run:


. xtreg lwtspread L1ggdp L1edtgdp L1tdsxgs L1resimp L1current lamount wtmat lus1 lswap embivol icrg_othersavg, fe r

Fixed-effects (within) regression               Number of obs      =       577
Group variable: coucode                         Number of groups   =        39

R-sq:  within  = 0.2646                         Obs per group: min =         1
       between = 0.0286                                        avg =      14.8
       overall = 0.2171                                        max =        54

                                                F(11,38)           =     15.74
corr(u_i, Xb)  = -0.0340                        Prob > F           =    0.0000

                               (Std. Err. adjusted for 39 clusters in coucode)
------------------------------------------------------------------------------
             |               Robust
   lwtspread |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      L1ggdp |  -.0729411   .0288699    -2.53   0.016    -.1313851   -.0144971
    L1edtgdp |   .9065034   .2597582     3.49   0.001     .3806503    1.432356
    L1tdsxgs |   .3097701   .2104923     1.47   0.149    -.1163492    .7358895
    L1resimp |    .008808   .0599197     0.15   0.884    -.1124932    .1301091
L1currenta~P |    .028612     .01048     2.73   0.010     .0073962    .0498277
     lamount |   .0230932   .0311249     0.74   0.463    -.0399158    .0861023
       wtmat |   .0176812    .004187     4.22   0.000      .009205    .0261574
        lus1 |  -.1535447   .0760022    -2.02   0.050    -.3074031    .0003136
       lswap |   .3376648   .0899795     3.75   0.001     .1555109    .5198188
     embivol |   .0015103   .0005278     2.86   0.007     .0004418    .0025787
icrg_other~g |  -.0597775   .0112887    -5.30   0.000    -.0826304   -.0369246
       _cons |   7.747563   .8559409     9.05   0.000     6.014802    9.480325
-------------+----------------------------------------------------------------
     sigma_u |  .57511148
     sigma_e |   .5347369
         rho |  .53633039   (fraction of variance due to u_i)
------------------------------------------------------------------------------


. est store a

. xtreg lwtspread L1ggdp L1edtgdp L1tdsxgs L1resimp L1current lamount wtmat lus1 lswap embivol icrg_othersavg if year<1998, fe r

Fixed-effects (within) regression               Number of obs      =       215
Group variable: coucode                         Number of groups   =        34

R-sq:  within  = 0.0838                         Obs per group: min =         1
       between = 0.0590                                        avg =       6.3
       overall = 0.0719                                        max =        27

                                                F(11,33)           =      4.52
corr(u_i, Xb)  = -0.0641                        Prob > F           =    0.0004

                               (Std. Err. adjusted for 34 clusters in coucode)
------------------------------------------------------------------------------
             |               Robust
   lwtspread |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      L1ggdp |  -.0559695   .0438133    -1.28   0.210    -.1451084    .0331694
    L1edtgdp |   .9433012   .9674916     0.97   0.337    -1.025075    2.911678
    L1tdsxgs |   .1036246   .3152987     0.33   0.744    -.5378555    .7451047
    L1resimp |   .0109912   .0501988     0.22   0.828    -.0911391    .1131215
L1currenta~P |    .015879   .0411585     0.39   0.702    -.0678586    .0996166
     lamount |  -.0140232   .0257043    -0.55   0.589    -.0663189    .0382725
       wtmat |   .0105855   .0038413     2.76   0.009     .0027703    .0184007
        lus1 |  -.1456119   .2018347    -0.72   0.476    -.5562477     .265024
       lswap |    .043539   .3455338     0.13   0.900    -.6594549    .7465328
     embivol |  -.0010133   .0013232    -0.77   0.449    -.0037053    .0016787
icrg_other~g |  -.0253633   .0147068    -1.72   0.094    -.0552846     .004558
       _cons |   6.792991   1.972731     3.44   0.002      2.77944    10.80654
-------------+----------------------------------------------------------------
     sigma_u |  .56983386
     sigma_e |  .47962122
         rho |  .58533087   (fraction of variance due to u_i)
------------------------------------------------------------------------------


. est store b

. xtreg lwtspread L1ggdp L1edtgdp L1tdsxgs L1resimp L1current lamount wtmat lus1 lswap embivol icrg_othersavg if year>=1998, fe r

Fixed-effects (within) regression               Number of obs      =       362
Group variable: coucode                         Number of groups   =        36

R-sq:  within  = 0.3305                         Obs per group: min =         1
       between = 0.0409                                        avg =      10.1
       overall = 0.2240                                        max =        28

                                                F(11,35)           =     17.77
corr(u_i, Xb)  = -0.1074                        Prob > F           =    0.0000

                               (Std. Err. adjusted for 36 clusters in coucode)
------------------------------------------------------------------------------
             |               Robust
   lwtspread |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      L1ggdp |   .0092044   .0229911     0.40   0.691      -.03747    .0558788
    L1edtgdp |   .8985159   .2941651     3.05   0.004     .3013291    1.495703
    L1tdsxgs |   .4459004   .3004651     1.48   0.147    -.1640763    1.055877
    L1resimp |   -.081875   .0500047    -1.64   0.111    -.1833898    .0196399
L1currenta~P |   .0361645   .0101164     3.57   0.001      .015627    .0567019
     lamount |   -.004259   .0549545    -0.08   0.939    -.1158226    .1073045
       wtmat |   .0264663   .0054716     4.84   0.000     .0153584    .0375742
        lus1 |  -.1685346   .0844278    -2.00   0.054    -.3399321    .0028629
       lswap |   .3691567   .1319395     2.80   0.008     .1013053     .637008
     embivol |   .0014359   .0007466     1.92   0.063    -.0000797    .0029515
icrg_other~g |  -.1377502   .0390392    -3.53   0.001     -.217004   -.0584964
       _cons |   13.33052   2.992139     4.46   0.000     7.256157    19.40489
-------------+----------------------------------------------------------------
     sigma_u |  .62537276
     sigma_e |  .52385355
         rho |  .58765301   (fraction of variance due to u_i)
------------------------------------------------------------------------------
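
A possible way to test directly whether the GDP growth coefficient differs between the two periods is to pool the data and interact L1ggdp with a post-1998 indicator. The sketch below does that; post and ggdp_post are made-up names. Note that it is not equivalent to the split-sample regressions above, because it keeps the country effects and all the other slopes common to both periods.

```stata
* Hypothetical helper names: post, ggdp_post
capture drop post
capture drop ggdp_post
generate byte post = year >= 1998 if !missing(year)
generate ggdp_post = L1ggdp * post

* Pooled FE model with a period shift and a period-specific GDP slope.
* The /// line continuation works in a do-file, not at the command prompt.
xtreg lwtspread L1ggdp ggdp_post post L1edtgdp L1tdsxgs L1resimp L1current ///
    lamount wtmat lus1 lswap embivol icrg_othersavg, fe r

* H0: the GDP-growth slope is the same before and after 1998
test ggdp_post
```

Here the coefficient on L1ggdp is the pre-1998 slope, and the coefficient on ggdp_post is the change in the slope from 1998 onward.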


Thanks a lot!
Adrian



>
>
> Kit Baum | Boston College Economics and DIW Berlin | http://ideas.repec.org/e/pba1.html
> An Introduction to Stata Programming | http://www.stata-press.com/books/isp.html
> An Introduction to Modern Econometrics Using Stata | http://www.stata-press.com/books/imeus.html
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/