



st: RE: RE: Interpreting Kleibergen Paap weak instrument statistic


From   "Schaffer, Mark E" <M.E.Schaffer@hw.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: RE: Interpreting Kleibergen Paap weak instrument statistic
Date   Mon, 25 Jun 2012 15:52:10 +0100

James,

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Fitzgerald, James
> Sent: 25 June 2012 14:53
> To: statalist@hsphsun2.harvard.edu
> Subject: st: RE: RE: Interpreting Kleibergen Paap weak instrument 
> statistic
> 
> Mark,
> 
> Thank you very much for your reply.
> 
> I have a few follow-up questions that you might be able to help me 
> with. First though I thought it might be helpful if I gave a quick 
> synopsis of my research question.
> 
> I am investigating the determinants of capital structure in UK Plcs, 
> and my main hypothesis is that the theories espoused in the extant 
> literature are only applicable to certain types of firms.
> As such, I divide my sample into sub-samples based on certain firm 
> characteristics i.e. size, tangibility of assets etc., and compare 
> regressor coefficients across the sub-samples.

I'm not sure I understand.  Do you estimate separately for the different
subsamples, or do you interact your regressors with indicator
variables and estimate one big regression?

> However, I was initially worried that such a categorisation procedure 
> might introduce endogeneity issues that vary across sub-samples, 
> so that I could not reliably compare coefficients across 
> sub-samples. Hence I decided to employ instrumental variables (lagged 
> independent variables) to overcome such issues. Within each 
> sub-sample I test the orthogonality assumption of my included 
> regressors (on an individual basis) using the orthog option in 
> xtivreg2. Any variables I find to be potentially endogenous (C-stat 
> p-value < 0.100) are then instrumented where instruments are available. 
> I am currently unaware of any method to correctly test the i.i.d. 
> assumption using xtivreg2, so I have decided to drop the 
> assumption; hence my question regarding the KP stat.
> 
> With regards to your earlier reply, the following are some follow up 
> questions I still have.
> 
> 1. Is there an option in ivreg2 to test the i.i.d. 
> assumption, and if not, how would I go about testing it?

This amounts to testing for heteroskedasticity or autocorrelation.
-ivhettest- and -ivactest- will report such tests for IV models.  But
you are using a fixed effects model, which complicates things a bit.
How long is your T dimension?  I see from the estimation below that you
are using a kernel-robust VCE, which implies T is biggish.  If so, you
could apply the fixed effects transformation to your data by hand (e.g.,
using Ben Jann's -center- command) and then use these programs.  But
this is a bit tricky.

The simplest way to test the i.i.d. assumption is to do an eyeball
version of a White-type test.  Estimate the model using kernel-robust
VCEs, and then again without this option, i.e., using the classical VCE.
Do the SEs look very different?  If so, it's likely that the i.i.d.
assumption would fail if you tested formally using a White-type test,
since the same principle is involved - the test stat is based on a
vector of contrasts between the robust and classical VCEs.
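
As a rough illustration of that eyeball comparison, here is a minimal
sketch in Python (not Stata).  The robust SEs are copied from the output
quoted further down; the classical SEs are made-up placeholders, since the
model has not been re-estimated without the robust and bw() options here.
The helper name se_ratios is likewise just an illustrative choice.

```python
# Eyeball comparison of kernel-robust vs classical standard errors.
# robust_se: copied from the xtivreg2 output quoted in this thread.
# classical_se: HYPOTHETICAL placeholders; replace them with the SEs from
# the same model re-estimated without the robust/bw() options.
robust_se = {"liq": 0.0049465, "tang": 0.0610377, "tax": 0.00924}
classical_se = {"liq": 0.0031, "tang": 0.0420, "tax": 0.0090}  # hypothetical


def se_ratios(robust, classical):
    """Return the robust/classical SE ratio for each shared coefficient."""
    return {name: robust[name] / classical[name]
            for name in robust if name in classical}


ratios = se_ratios(robust_se, classical_se)
for name, ratio in ratios.items():
    print(f"{name:6s} robust/classical SE ratio = {ratio:.2f}")
```

Ratios well away from 1 for several coefficients would be the informal
signal that the i.i.d. assumption is doubtful.  There is no formal cutoff
in this sketch; it only organises the comparison.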

> 2. With regards to the Anderson-Rubin statistic and the Stock-Wright 
> LM S statistic, both of which are reported by xtivreg2, am I correct 
> in my interpretation that given that they both test the joint 
> hypotheses of weak instruments and orthogonality, the statistics are 
> only interpretable from a weak instruments perspective as long as the 
> Hansen J test of all excluded instruments indicates orthogonality 
> conditions are valid?

Sort of ... it's a little more complicated than that.  I recommend
reading the Finlay-Magnusson paper on this.

> 3. Included below are the first-stage regression results from one of the
> tests I run.

Maybe I am misreading the output, but it looks like only the summary
stats for the first stage are reported.

> As you can see the Cragg Donald and
> Kleibergen Paap stats both suggest that the instruments are not weak. 
> However, the AR and SW stats suggest that the instruments, given that 
> the Hansen J-test does not reject the null, are potentially weak.

No, that's a misinterpretation of the AR and SW tests.  See below.

> From the output these stats
> appear to me to be testing the explanatory power of the instrument 
> rather than whether or not it is weak

Neither.  These are not tests of the strength or explanatory power of
the IV.  They are just what the output says: tests of the significance
of the endogenous regressor.

Your endogenous regressor is liq.  In the main output, the coeff on liq
is -.0085538, with a z-stat of -1.73 and a p-value of 0.084.  That is,
the Wald test stat for the null that the coeff on liq=0 has a p-value of
0.084.

The A-R test stat (F version) for the same hypothesis, i.e., B1=0,
augmented by the additional hypothesis that the IVs are exogenous, has a
p-value of 0.0607.  Very similar.
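
To make the comparison concrete, the reported p-values can be recomputed
from the statistics in the output.  This is only an arithmetic sketch in
Python using the standard library; the function names are illustrative,
and the chi-square tail formula used is the closed form that holds for
even degrees of freedom (here df=4).

```python
import math


def normal_two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square with EVEN integer df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here needs positive even df")
    half = x / 2.0
    series = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * series


# Wald test of liq = 0, from the coefficient table in the quoted output.
coef, se = -0.0085538, 0.0049465
wald_z = coef / se
wald_p = normal_two_sided_p(wald_z)

# Weak-instrument-robust statistics, chi-square versions with 4 df.
ar_p = chi2_sf_even_df(9.14, 4)   # Anderson-Rubin Wald test
sw_p = chi2_sf_even_df(9.22, 4)   # Stock-Wright LM S statistic

print(f"Wald z = {wald_z:.2f}, p = {wald_p:.3f}")
print(f"A-R chi-sq(4) p = {ar_p:.4f}")
print(f"S-W chi-sq(4) p = {sw_p:.4f}")
```

Up to rounding of the printed statistics, this reproduces the 0.084 Wald
p-value and the 0.0577 A-R chi-square p-value.  It shows that all three
are p-values for (augmented versions of) the same hypothesis about the
coefficient on liq, not measures of instrument strength.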

The A-R-type approach can be extended to generate weak-instrument-robust
confidence intervals.  That's what Finlay & Magnusson's -rivtest- will
do for you.

HTH,
Mark

> i.e. 
> 
> Weak-instrument-robust inference
> Tests of joint significance of endogenous regressors B1 in main 
> equation
> Ho: B1=0 and orthogonality conditions are valid
> 
> The coefficient significance level of the instrumented variable (liq) 
> is relatively low (p-value = 0.084), but the instrument does not 
> appear to be weak (based on CD and KP stats). However, I would 
> conclude that it potentially is weak based on the AR and SW stats.
> Is my interpretation incorrect, and if so could you indicate how these
> stats ought to be interpreted?
> 
> I greatly appreciate any help you can offer
> 
> Best regards
> 
> James
> 
> Summary results for first-stage regressions
> 
>                                    (Underid)               (Weak id)
> Variable   F(  4,  2541)  P-val    AP Chi-sq(  4)  P-val   AP F(  4,  2541)
> liq              20.20   0.0000           81.78   0.0000           20.20
> 
> NB: first-stage test statistics heteroskedasticity and autocorrelation-robust
> 
> Stock-Yogo weak ID test critical values for single endogenous regressor:
> 5% maximal IV relative bias    16.85
> 10% maximal IV relative bias    10.27
> 20% maximal IV relative bias     6.71
> 30% maximal IV relative bias     5.34
> 10% maximal IV size             24.58
> 15% maximal IV size             13.96
> 20% maximal IV size             10.26
> 25% maximal IV size              8.31
> Source: Stock-Yogo (2005).  Reproduced by permission.
> NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
> 
> Underidentification test
> Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
> Ha: matrix has rank=K1 (identified)
> Kleibergen-Paap rk LM statistic          Chi-sq(4)=58.30    P-val=0.0000
> 
> Weak identification test
> Ho: equation is weakly identified
> Cragg-Donald Wald F statistic                                      78.65
> Kleibergen-Paap Wald rk F statistic                                20.20
> Stock-Yogo weak ID test critical values for K1=1 and L1=4:
> 5% maximal IV relative bias    16.85
> 10% maximal IV relative bias    10.27
> 20% maximal IV relative bias     6.71
> 30% maximal IV relative bias     5.34
> 10% maximal IV size             24.58
> 15% maximal IV size             13.96
> 20% maximal IV size             10.26
> 25% maximal IV size              8.31
> Source: Stock-Yogo (2005).  Reproduced by permission.
> NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
> 
> Weak-instrument-robust inference
> Tests of joint significance of endogenous regressors B1 in main equation
> Ho: B1=0 and orthogonality conditions are valid
> Anderson-Rubin Wald test           F(4,2541)=      2.26     P-val=0.0607
> Anderson-Rubin Wald test           Chi-sq(4)=      9.14     P-val=0.0577
> Stock-Wright LM S statistic        Chi-sq(4)=      9.22     P-val=0.0557
> NB: Underidentification, weak identification and weak-identification-robust
> test statistics heteroskedasticity and autocorrelation-robust
> 
> Number of observations               N  =       3021
> Number of regressors                 K  =         28
> Number of endogenous regressors      K1 =          1
> Number of instruments                L  =         31
> Number of excluded instruments       L1 =          4
> 2-Step GMM estimation
> 
> Estimates efficient for arbitrary heteroskedasticity and autocorrelation
> Statistics robust to heteroskedasticity and autocorrelation
>   kernel=Bartlett; bandwidth=2
>   time variable (t):  year
>   group variable (i): firm
> 
>                                                 Number of obs =     3021
>                                                 F( 28,  2544) =     3.02
>                                                 Prob > F      =   0.0000
> Total (centered) SS     =  21.06783592          Centered R2   =   0.0261
> Total (uncentered) SS   =  21.06783592          Uncentered R2 =   0.0261
> Residual SS             =  20.51803233          Root MSE      =   .08932
> 
>                         Robust
>    ltdbv       Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
>      liq   -.0085538   .0049465    -1.73   0.084    -.0182487    .0011411
>   lnsale    .0053743   .0052578     1.02   0.307    -.0049307    .0156794
>     tang    .1170177   .0610377     1.92   0.055    -.0026139    .2366493
>    itang    .0557467   .0239463     2.33   0.020     .0088127    .1026806
> itangdum    .0123551   .0065003     1.90   0.057    -.0003853    .0250955
>      tax   -.0193497     .00924    -2.09   0.036    -.0374598   -.0012396
>     prof    .0025405   .0027681     0.92   0.359    -.0028849    .0079659
>      mtb   -.0019451   .0019992    -0.97   0.331    -.0058635    .0019733
>  capexsa    .0108254   .0087886     1.23   0.218       -.0064    .0280507
>     ndts   -.0022495   .0032416    -0.69   0.488     -.008603     .004104
>     yr90   -.0860865   .1693451    -0.51   0.611    -.4179968    .2458238
>     yr91   -.0057954   .0156291    -0.37   0.711     -.036428    .0248371
>     yr92    .0060493   .0148008     0.41   0.683    -.0229596    .0350583
>     yr93   -.0066494   .0154936    -0.43   0.668    -.0370163    .0237174
>     yr94   -.0038801   .0137634    -0.28   0.778    -.0308559    .0230956
>     yr95   -.0021814   .0139629    -0.16   0.876    -.0295482    .0251854
>     yr96     .007044   .0137418     0.51   0.608    -.0198895    .0339775
>     yr97    .0119441   .0134385     0.89   0.374    -.0143949    .0382831
>     yr98    .0069794    .013185     0.53   0.597    -.0188627    .0328216
>     yr99    .0132963   .0125952     1.06   0.291    -.0113898    .0379825
>     yr00    .0080221   .0119826     0.67   0.503    -.0154633    .0315074
>     yr01   -.0000815   .0107388    -0.01   0.994    -.0211291    .0209661
>     yr02    .0001449   .0106504     0.01   0.989    -.0207295    .0210193
>     yr03    .0106314   .0115621     0.92   0.358    -.0120299    .0332926
>     yr04    .0097052   .0102908     0.94   0.346    -.0104643    .0298748
>     yr05    .0156916   .0108831     1.44   0.149    -.0056388    .0370221
>     yr06    .0093837   .0108831     0.86   0.389    -.0119467    .0307142
>     yr07     .005672   .0086985     0.65   0.514    -.0113768    .0227207
> 
> Underidentification test (Kleibergen-Paap rk LM statistic):             58.301
>                                                    Chi-sq(4) P-val =    0.0000
> 
> Weak identification test (Cragg-Donald Wald F statistic):               78.647
>                          (Kleibergen-Paap rk Wald F statistic):         20.198
> Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    16.85
>                                          10% maximal IV relative bias    10.27
>                                          20% maximal IV relative bias     6.71
>                                          30% maximal IV relative bias     5.34
>                                          10% maximal IV size             24.58
>                                          15% maximal IV size             13.96
>                                          20% maximal IV size             10.26
>                                          25% maximal IV size              8.31
> Source: Stock-Yogo (2005).  Reproduced by permission.
> NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
> 
> Hansen J statistic (overidentification test of all instruments):         5.596
>                                                    Chi-sq(3) P-val =    0.1330
> Instrumented:         liq
> Included instruments: lnsale tang itang itangdum tax prof mtb capexsa ndts yr90
>                       yr91 yr92 yr93 yr94 yr95 yr96 yr97 yr98 yr99 yr00 yr01
>                       yr02 yr03 yr04 yr05 yr06 yr07
> Excluded instruments: tang1 itang1 mtb1 liq1
> Dropped collinear:    yr08
> 
> . 
> 
> ________________________________________
> From: owner-statalist@hsphsun2.harvard.edu
> [owner-statalist@hsphsun2.harvard.edu] on behalf of Schaffer, Mark E 
> [M.E.Schaffer@hw.ac.uk]
> Sent: 25 June 2012 12:33
> To: statalist@hsphsun2.harvard.edu
> Subject: st: RE: Interpreting Kleibergen Paap weak instrument 
> statistic
> 
> James,
> 
> > -----Original Message-----
> > From: owner-statalist@hsphsun2.harvard.edu
> > [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of 
> > Fitzgerald, James
> > Sent: 21 June 2012 14:02
> > To: statalist@hsphsun2.harvard.edu
> > Subject: st: Interpreting Kleibergen Paap weak instrument statistic
> >
> > Hi Statalist users
> >
> > I am using xtivreg2 to estimate a GMM-IV model (I specify the 
> > following options; fe robust bw(2) gmm2s). I am not assuming i.i.d 
> > errors, and thus when testing for weak instruments I am using the 
> > Kleibergen Paap rk wald F statistic rather than the Cragg Donald 
> > wald F statistic.
> >
> > xtivreg2 produces Stock-Yogo critical values for the Cragg Donald 
> > statistic assuming i.i.d errors, so I'm not sure how to interpret 
> > the KP rk wald F stat.
> >
> > The help file for ivreg2 (Baum, Schaffer and Stillman, 2010) does 
> > however mention the following:
> >
> > When the i.i.d. assumption is dropped and ivreg2 is invoked with the
> > robust, bw or cluster options, the Cragg-Donald-based weak 
> > instruments test is no longer valid.
> > ivreg2 instead reports a correspondingly-robust Kleibergen-Paap Wald
> > rk F statistic.  The degrees of freedom adjustment for the rk 
> > statistic is (N-L)/L1, as with the Cragg-Donald F statistic, except 
> > in the cluster-robust case, when the adjustment is N/(N-1) * 
> > (N_clust-1)/N_clust, following the standard Stata small-sample 
> > adjustment for cluster-robust. In the case of two-way clustering, 
> > N_clust is the minimum of N_clust1 and N_clust2.  The critical 
> > values reported by ivreg2 for the Kleibergen-Paap statistic are the 
> > Stock-Yogo critical values for the Cragg-Donald i.i.d. case.
> > The critical values reported with 2-step GMM are the Stock-Yogo IV 
> > critical values, and the critical values reported with CUE are the 
> > LIML critical values.
> >
> >
> > My understanding of the end of the paragraph is that the KP stat can
> > still be compared to the Stock-Yogo values produced by Stata in 
> > determining whether or not instruments are weak.
> >
> > If someone could confirm or reject this I would be eternally 
> > grateful!!
> 
> I wrote that paragraph, so the ambiguity is partly my fault.  But the 
> problem is that there are no concrete results in the literature for 
> testing for weak IVs when the i.i.d. assumption fails.  The only thing
> one can do (that I'm aware of, anyway) is to point to stats that have 
> an asymptotic justification in a test of underidentification, which is
> what the output of -ivreg2- does.  That is, the K-P stat can be used 
> to test for underidentification without the i.i.d. assumption, and 
> under i.i.d. it has the same distribution under the null as the
> Cragg-Donald stat.  This justification is different from that underlying
> the Stock-Yogo critical values, so this is pretty hand-wavey.
> 
> The alternative is weak-instrument-robust estimation, a la 
> Anderson-Rubin, Moreira, Kleibergen, etc.  The Finlay-Magnusson
> -rivtest- command, available via ssc ideas in the usual way, supports 
> this.  Also see their accompanying SJ paper (vol. 9 no. 3).
> The command doesn't directly support panel data estimation, which is
> what you have, but you could just demean your variables by hand.
> 
> HTH,
> Mark
> 
> 
> > Best wishes
> >
> > James Fitzgerald
> > *
> > *   For searches and help try:
> > *   http://www.stata.com/help.cgi?search
> > *   http://www.stata.com/support/statalist/faq
> > *   http://www.ats.ucla.edu/stat/stata/



