
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: RE: RE: Interpreting Kleibergen Paap weak instrument statistic

From: "Fitzgerald, James" <[email protected]>
To: "[email protected]" <[email protected]>
Subject: st: RE: RE: Interpreting Kleibergen Paap weak instrument statistic
Date: Mon, 25 Jun 2012 13:53:18 +0000

Mark,

Thank you very much for your reply.

I have a few follow-up questions that you might be able to help me with. First though I thought it might be helpful if I gave a quick synopsis of my research question.

I am investigating the determinants of capital structure in UK Plcs, and my main hypothesis is that the theories espoused in the extant literature are only applicable to certain types of firms. 
As such, I divide my sample into sub-samples based on certain firm characteristics (e.g. size, tangibility of assets) and compare regressor coefficients across the sub-samples. 
However, I was initially worried that such a categorisation procedure might introduce endogeneity issues that vary across sub-samples, so that I would not be able to compare coefficients reliably across them. Hence I decided to employ instrumental variables (lagged independent variables) to overcome such issues. Within each sub-sample I test the orthogonality assumption for each included regressor individually, using the orthog option in xtivreg2. Any variable I find to be potentially endogenous (C-stat p-value < 0.100) is then instrumented where instruments are available. 
I am currently unaware of any method to test the i.i.d. assumption correctly using xtivreg2, so I have decided to drop the assumption; hence my question regarding the KP stat. 
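
To make the set-up above concrete, here is a minimal sketch of the kind of command involved. The variable names, the panel identifiers (firm, year) and the options fe robust bw(2) gmm2s are those appearing elsewhere in this thread. Testing tang with orthog() is purely illustrative, and the yr* wildcard assumes the year dummies are the only variables whose names begin with yr.

```stata
* Sketch only: names are taken from the output posted in this thread.
* Declare the panel structure (required for fe and bw()).
xtset firm year

* Fixed-effects two-step GMM with HAC-robust statistics
* (Bartlett kernel is the default; bandwidth 2).
* first      : report the first-stage results
* orthog(tang): C statistic for the exogeneity of tang (illustrative choice)
xtivreg2 ltdbv lnsale tang itang itangdum tax prof mtb capexsa ndts yr* ///
    (liq = tang1 itang1 mtb1 liq1), fe robust bw(2) gmm2s first orthog(tang)
```

The same command would be repeated with a different variable in orthog() for each regressor tested, which needs enough excluded instruments for the equation to stay identified when the tested regressor is treated as endogenous.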

Following on from your earlier reply, I still have some follow-up questions.

1. Is there an option in ivreg2 to test the i.i.d. assumption, and if not, how would I go about testing it?

2. With regards to the Anderson-Rubin statistic and the Stock-Wright LM S statistic, both of which are reported by xtivreg2, am I correct in my interpretation that given that they both test the joint hypotheses of weak instruments and orthogonality, the statistics are only interpretable from a weak instruments perspective as long as the Hansen J test of all excluded instruments indicates orthogonality conditions are valid? 

3. Included below are the first-stage regression results from one of the regressions I ran. As you can see, the Cragg-Donald and Kleibergen-Paap stats both suggest that the instruments are not weak. However, given that the Hansen J test does not reject the null, the AR and SW stats suggest that the instruments are potentially weak. From the output, these stats appear to me to be testing the explanatory power of the instrument rather than whether or not it is weak, i.e. 

Weak-instrument-robust inference
Tests of joint significance of endogenous regressors B1 in main equation
Ho: B1=0 and orthogonality conditions are valid

The coefficient significance level of the instrumented variable (liq) is relatively low (p-value = 0.084), but the instrument does not appear to be weak (based on CD and KP stats). However, I would conclude that it potentially is weak based on the AR and SW stats.
Is my interpretation incorrect, and if so could you indicate how these stats ought to be interpreted?
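
As a rough arithmetic check on the figures referred to in point 3, the comparison with the Stock-Yogo values and the reported p-values can be reproduced with Stata's built-in display, chi2tail() and Ftail() functions. This is only a sketch: the numbers are typed in from the output in this message rather than read from stored results, and the computed p-values should agree with the reported ones only up to rounding of the test statistics.

```stata
* Kleibergen-Paap rk Wald F (20.20) against two of the Stock-Yogo critical values
display 20.20 > 16.85          // 1: above the 5% maximal IV relative bias value
display 20.20 > 24.58          // 0: below the 10% maximal IV size value

* p-values implied by the weak-instrument-robust test statistics
display chi2tail(4, 9.14)      // Anderson-Rubin Wald Chi-sq(4); reported as 0.0577
display chi2tail(4, 9.22)      // Stock-Wright LM S Chi-sq(4);   reported as 0.0557
display Ftail(4, 2541, 2.26)   // Anderson-Rubin Wald F(4,2541); reported as 0.0607
```

The first two lines show that the verdict depends on which Stock-Yogo criterion is used, bearing in mind the NB in the output that these critical values are for the Cragg-Donald F statistic and i.i.d. errors.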

I greatly appreciate any help you can offer

Best regards

James

Summary results for first-stage regressions

                                   (Underid)              (Weak id)
Variable     | F(  4,  2541)  P-val | AP Chi-sq(  4) P-val | AP F(  4,  2541)
liq          |      20.20    0.0000 |       81.78   0.0000 |       20.20

NB: first-stage test statistics heteroskedasticity and autocorrelation-robust

Stock-Yogo weak ID test critical values for single endogenous regressor:
5% maximal IV relative bias    16.85
10% maximal IV relative bias    10.27
20% maximal IV relative bias     6.71
30% maximal IV relative bias     5.34
10% maximal IV size             24.58
15% maximal IV size             13.96
20% maximal IV size             10.26
25% maximal IV size              8.31
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.

Underidentification test
Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
Ha: matrix has rank=K1 (identified)
Kleibergen-Paap rk LM statistic          Chi-sq(4)=58.30    P-val=0.0000

Weak identification test
Ho: equation is weakly identified
Cragg-Donald Wald F statistic                                      78.65
Kleibergen-Paap Wald rk F statistic                                20.20
Stock-Yogo weak ID test critical values for K1=1 and L1=4:
5% maximal IV relative bias    16.85
10% maximal IV relative bias    10.27
20% maximal IV relative bias     6.71
30% maximal IV relative bias     5.34
10% maximal IV size             24.58
15% maximal IV size             13.96
20% maximal IV size             10.26
25% maximal IV size              8.31
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.

Weak-instrument-robust inference
Tests of joint significance of endogenous regressors B1 in main equation
Ho: B1=0 and orthogonality conditions are valid
Anderson-Rubin Wald test           F(4,2541)=      2.26     P-val=0.0607
Anderson-Rubin Wald test           Chi-sq(4)=      9.14     P-val=0.0577
Stock-Wright LM S statistic        Chi-sq(4)=      9.22     P-val=0.0557
NB: Underidentification, weak identification and weak-identification-robust
test statistics heteroskedasticity and autocorrelation-robust

Number of observations               N  =       3021
Number of regressors                 K  =         28
Number of endogenous regressors      K1 =          1
Number of instruments                L  =         31
Number of excluded instruments       L1 =          4
2-Step GMM estimation

Estimates efficient for arbitrary heteroskedasticity and autocorrelation
Statistics robust to heteroskedasticity and autocorrelation
kernel=Bartlett; bandwidth=2
time variable (t):  year
group variable (i): firm
Number of obs =     3021
F( 28,  2544) =     3.02
Prob > F      =   0.0000
Total (centered) SS     =  21.06783592                Centered R2   =   0.0261
Total (uncentered) SS   =  21.06783592                Uncentered R2 =   0.0261
Residual SS             =  20.51803233                Root MSE      =   .08932

             |               Robust
       ltdbv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         liq |  -.0085538   .0049465    -1.73   0.084    -.0182487    .0011411
      lnsale |   .0053743   .0052578     1.02   0.307    -.0049307    .0156794
        tang |   .1170177   .0610377     1.92   0.055    -.0026139    .2366493
       itang |   .0557467   .0239463     2.33   0.020     .0088127    .1026806
    itangdum |   .0123551   .0065003     1.90   0.057    -.0003853    .0250955
         tax |  -.0193497     .00924    -2.09   0.036    -.0374598   -.0012396
        prof |   .0025405   .0027681     0.92   0.359    -.0028849    .0079659
         mtb |  -.0019451   .0019992    -0.97   0.331    -.0058635    .0019733
     capexsa |   .0108254   .0087886     1.23   0.218       -.0064    .0280507
        ndts |  -.0022495   .0032416    -0.69   0.488     -.008603     .004104
        yr90 |  -.0860865   .1693451    -0.51   0.611    -.4179968    .2458238
        yr91 |  -.0057954   .0156291    -0.37   0.711     -.036428    .0248371
        yr92 |   .0060493   .0148008     0.41   0.683    -.0229596    .0350583
        yr93 |  -.0066494   .0154936    -0.43   0.668    -.0370163    .0237174
        yr94 |  -.0038801   .0137634    -0.28   0.778    -.0308559    .0230956
        yr95 |  -.0021814   .0139629    -0.16   0.876    -.0295482    .0251854
        yr96 |    .007044   .0137418     0.51   0.608    -.0198895    .0339775
        yr97 |   .0119441   .0134385     0.89   0.374    -.0143949    .0382831
        yr98 |   .0069794    .013185     0.53   0.597    -.0188627    .0328216
        yr99 |   .0132963   .0125952     1.06   0.291    -.0113898    .0379825
        yr00 |   .0080221   .0119826     0.67   0.503    -.0154633    .0315074
        yr01 |  -.0000815   .0107388    -0.01   0.994    -.0211291    .0209661
        yr02 |   .0001449   .0106504     0.01   0.989    -.0207295    .0210193
        yr03 |   .0106314   .0115621     0.92   0.358    -.0120299    .0332926
        yr04 |   .0097052   .0102908     0.94   0.346    -.0104643    .0298748
        yr05 |   .0156916   .0108831     1.44   0.149    -.0056388    .0370221
        yr06 |   .0093837   .0108831     0.86   0.389    -.0119467    .0307142
        yr07 |    .005672   .0086985     0.65   0.514    -.0113768    .0227207

Underidentification test (Kleibergen-Paap rk LM statistic):             58.301
                                                   Chi-sq(4) P-val =    0.0000

Weak identification test (Cragg-Donald Wald F statistic):               78.647
                         (Kleibergen-Paap rk Wald F statistic):         20.198
Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    16.85
                                         10% maximal IV relative bias    10.27
                                         20% maximal IV relative bias     6.71
                                         30% maximal IV relative bias     5.34
                                         10% maximal IV size             24.58
                                         15% maximal IV size             13.96
                                         20% maximal IV size             10.26
                                         25% maximal IV size              8.31
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.

Hansen J statistic (overidentification test of all instruments):         5.596
                                                   Chi-sq(3) P-val =    0.1330
Instrumented:         liq
Included instruments: lnsale tang itang itangdum tax prof mtb capexsa ndts yr90
                      yr91 yr92 yr93 yr94 yr95 yr96 yr97 yr98 yr99 yr00 yr01
                      yr02 yr03 yr04 yr05 yr06 yr07
Excluded instruments: tang1 itang1 mtb1 liq1
Dropped collinear:    yr08


________________________________________
From: [email protected] [[email protected]] on behalf of Schaffer, Mark E [[email protected]]
Sent: 25 June 2012 12:33
To: [email protected]
Subject: st: RE: Interpreting Kleibergen Paap weak instrument statistic

James,

> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> Fitzgerald, James
> Sent: 21 June 2012 14:02
> To: [email protected]
> Subject: st: Interpreting Kleibergen Paap weak instrument statistic
>
> Hi Statalist users
>
> I am using xtivreg2 to estimate a GMM-IV model (I specify the
> following options; fe robust bw(2) gmm2s). I am not assuming
> i.i.d errors, and thus when testing for weak instruments I am
> using the Kleibergen Paap rk wald F statistic rather than the
> Cragg Donald wald F statistic.
>
> xtivreg2 produces Stock-Yogo critical values for the Cragg
> Donald statistic assuming i.i.d errors, so I'm not sure how
> to interpret the KP rk wald F stat.
>
> The help file for ivreg2 (Baum, Schaffer and Stillman, 2010)
> does however mention the following:
>
> When the i.i.d. assumption is dropped and ivreg2 is invoked
> with the robust, bw or cluster options, the
> Cragg-Donald-based weak instruments test is no longer valid.
> ivreg2 instead reports a correspondingly-robust
> Kleibergen-Paap Wald rk F statistic.  The degrees of freedom
> adjustment for the rk statistic is (N-L)/L1, as with the
> Cragg-Donald F statistic, except in the cluster-robust case,
> when the adjustment is N/(N-1) * (N_clust-1)/N_clust,
> following the standard Stata small-sample adjustment for
> cluster-robust. In the case of two-way clustering, N_clust is
> the minimum of N_clust1 and N_clust2.  The critical values
> reported by ivreg2 for the Kleibergen-Paap statistic are the
> Stock-Yogo critical values for the Cragg-Donald i.i.d. case.
> The critical values reported with 2-step GMM are the
> Stock-Yogo IV critical values, and the critical values
> reported with CUE are the LIML critical values.
>
>
> My understanding of the end of the paragraph is that the KP
> stat can still be compared to the Stock-Yogo values produced
> by STATA in determining whether or not instruments are weak.
>
> If someone could confirm or reject this I would be eternally
> grateful!!

I wrote that paragraph, so the ambiguity is partly my fault.  But the
problem is that there are no concrete results in the literature for
testing for weak IVs when the i.i.d. assumption fails.  The only thing
one can do (that I'm aware of, anyway) is to point to stats that have an
asymptotic justification in a test of underidentification, which is what
the output of -ivreg2- does.  That is, the K-P stat can be used to test
for underidentification without the i.i.d. assumption, and under i.i.d.
it has the same distribution under the null as the Cragg-Donald stat.
This justification is different from that underlying the Stock-Yogo
critical values, so this is pretty hand-wavey.

The alternative is weak-instrument-robust estimation, a la
Anderson-Rubin, Moreira, Kleibergen, etc.  The Finlay-Magnusson
-rivtest- command, available via ssc ideas in the usual way, supports
this.  Also see their accompanying SJ paper (vol. 9 no. 3).  The command
doesn't directly support panel data estimation, which is what you have,
but you could just demean your variables by hand.
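
A minimal sketch of what demeaning by hand followed by -rivtest- might look like is given here. It is not taken from the original posts: the variable names come from the output earlier in the thread, the dm_ and mean_ prefixes are made up for illustration, and it assumes that -rivtest- is installed from SSC, that it can be run after -ivreg2- with the robust option (check its help file for which robust options it accepts), and that no variable in the list has missing values within the estimation sample.

```stata
* Sketch only; see the assumptions stated above.
ssc install rivtest

* Collect every variable in the model, leaving out the collinear dummy yr08.
local vars ltdbv liq lnsale tang itang itangdum tax prof mtb capexsa ndts ///
    tang1 itang1 mtb1 liq1
unab yrs : yr*
local omit yr08
local yrs : list yrs - omit

* Within transformation: subtract firm-level means from each variable.
local dmyrs
foreach v of local vars {
    bysort firm: egen double mean_`v' = mean(`v')
    generate double dm_`v' = `v' - mean_`v'
    drop mean_`v'
}
foreach v of local yrs {
    bysort firm: egen double mean_`v' = mean(`v')
    generate double dm_`v' = `v' - mean_`v'
    drop mean_`v'
    local dmyrs `dmyrs' dm_`v'
}

* IV on the demeaned data, then weak-instrument-robust tests and confidence sets.
ivreg2 dm_ltdbv dm_lnsale dm_tang dm_itang dm_itangdum dm_tax dm_prof dm_mtb ///
    dm_capexsa dm_ndts `dmyrs' (dm_liq = dm_tang1 dm_itang1 dm_mtb1 dm_liq1), robust
rivtest, ci
```

Note that demeaning by hand does not adjust the degrees of freedom for the absorbed firm effects, so small-sample statistics will differ somewhat from those reported by -xtivreg2, fe-.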

HTH,
Mark


> Best wishes
>
> James Fitzgerald
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


--
Heriot-Watt University is the Sunday Times
Scottish University of the Year 2011-2012

Heriot-Watt University is a Scottish charity
registered under charity number SC000278.


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
*