Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: "Fitzgerald, James" <J.Fitzgerald2@ucc.ie>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: st: FW: RE: RE: RE: Interpreting Kleibergen Paap weak instrument statistic
Date: Mon, 25 Jun 2012 16:39:52 +0000

Mark,

In my last e-mail I stated the following:

"I think I now understand what the AR tests are reporting; the AR stat p-value (0.0607) is interpreted in the same manner as the p-value for liq in the main output (0.084), but with the added orthogonality condition. And given that both p-values are very similar, I can infer with some degree of reliability that the instrument is not weak (that degree of reliability being dependent on the confidence intervals I can generate using Finlay and Magnusson's -rivtest-). Is that correct?"

Upon reflection I realise (at least I think I do) that the AR stat provides a means to interpret the effect of the instrumented endogenous variable that is robust to the instrument being weak. However, if the AR p-value and the p-value from the main equation differ significantly then the orthogonality condition does not hold. Can a difference also indicate, though, that the instrument is actually weak?

Regards

James

________________________________________
From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] on behalf of Fitzgerald, James [J.Fitzgerald2@ucc.ie]
Sent: 25 June 2012 16:51
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: RE: RE: Interpreting Kleibergen Paap weak instrument statistic

Mark,

________________________________________
From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] on behalf of Schaffer, Mark E [M.E.Schaffer@hw.ac.uk]
Sent: 25 June 2012 15:54
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: RE: Interpreting Kleibergen Paap weak instrument statistic

James,

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Fitzgerald, James
> Sent: 25 June 2012 14:53
> To: statalist@hsphsun2.harvard.edu
> Subject: st: RE: RE: Interpreting Kleibergen Paap weak instrument statistic
>
> Mark,
>
> Thank you very much for your reply.
>
> I have a few follow-up questions that you might be able to help me with. First though I thought it might be helpful if I gave a quick synopsis of my research question.
>
> I am investigating the determinants of capital structure in UK Plcs, and my main hypothesis is that the theories espoused in the extant literature are only applicable to certain types of firms.
> As such, I divide my sample into sub-samples based on certain firm characteristics, i.e. size, tangibility of assets etc., and compare regressor coefficients across the sub-samples.

I'm not sure I understand. Do you estimate separately for the different subsamples, or do you interact your coefficients with indicator variables and estimate one big regression?

I estimate separately for the different sub-samples. I decided to take this approach as I am interested in how the effects of a number of the independent variables vary across the sub-samples, and was advised that indicator variables can only be employed for one variable at a time in a model. Furthermore, it was pointed out to me that a binary indicator variable is no longer binary after a fixed effects transformation, i.e. indicator variables coded as 1 or 0 can take the values -1, 0, 1 after a first differences transformation, and can take T values after an about-the-mean transformation.

> However, I was initially worried that such a categorisation procedure might introduce endogeneity issues that might vary across sub-samples, and thus I would not be able to reliably compare coefficients across sub-samples. Hence I decided to employ instrumental variables (lagged independent variables) to overcome such issues. Within each sub-sample I test the orthogonality assumption of my included regressors (on an individual basis) using the orthog option in xtivreg2. Any variables I find to be potentially endogenous (C-stat p-value <0.100) are then instrumented where instruments are available.
> I am currently unaware of any method to correctly test the i.i.d. assumption using xtivreg2, and so I have decided to drop the assumption, and hence my question with regards to the KP stat.
>
> With regards to your earlier reply, the following are some follow-up questions I still have.
>
> 1. Is there an option in ivreg2 to test the i.i.d. assumption, and if not, how would I go about testing same?

This amounts to testing for heteroskedasticity or autocorrelation. -ivhettest- and -ivactest- will report such tests for IV models. But you are using a fixed effects model, which complicates things a bit. How long is your T dimension? I see from the estimation below that you are using a kernel-robust VCE, which implies T is biggish. If so, you could apply the fixed effects transformation to your data by hand (e.g., using Ben Jann's -center- command) and then use these programs. But this is a bit tricky.

The simplest way to test the i.i.d. assumption is to do an eyeball version of a White-type test. Estimate the model using kernel-robust VCEs, and then again without this option, i.e., using the classical VCE. Do the SEs look very different? If so, it's likely that the i.i.d. assumption would fail if you tested formally using a White-type test, since the same principle is involved - the test stat is based on a vector of contrasts between the robust and classical VCEs.

I am using an unbalanced panel dataset, so my T dimension varies from 1 to 20. My understanding of the kernel-robust option is very limited, and I specify it so that my output is robust to autocorrelation. I think I will try your "eyeball" test suggestion, as I have about reached the limit of my econometric abilities! Thus, if I "see" major differences in the SEs the i.i.d. assumption is invalid?

> 2. With regards to the Anderson-Rubin statistic and the Stock-Wright LM S statistic, both of which are reported by xtivreg2, am I correct in my interpretation that given that they both test the joint hypotheses of weak instruments and orthogonality, the statistics are only interpretable from a weak instruments perspective as long as the Hansen J test of all excluded instruments indicates orthogonality conditions are valid?

Sort of ... it's a little more complicated than that. I recommend reading the Finlay-Magnusson paper on this.

> 3. Included below are the first stage regression results from one of the tests I run.

Maybe I am misreading the output, but it looks like only the summary stats for the first stage are reported.

Yes, I only included the summary first stage regression results. Below is the complete output produced by Stata.

> As you can see the Cragg Donald and Kleibergen Paap stats both suggest that the instruments are not weak. However, the AR and SW stats suggest that the instruments, given that the Hansen J-test does not reject the null, are potentially weak.

No, that's a misinterpretation of the AR and SW tests. See below.

> From the output these stats appear to me to be testing the explanatory power of the instrument rather than whether or not it is weak

Neither. These are not tests of the strength or explanatory power of the IV. They are just what the output says: tests of the significance of the endogenous regressor. Your endogenous regressor is liq. In the main output, the coeff on liq is -.0085538, with a z-stat of -1.73 and a p-value of 0.084. That is, the Wald test stat for the null that the coeff on liq=0 has a p-value of 0.084. The A-R test stat (F version) for the same hypothesis, i.e., B1=0, augmented by the additional hypothesis that the IVs are exogenous, has a p-value of 0.0607. Very similar.

The A-R-type approach can be extended to generate weak-instrument-robust confidence intervals.
That's what Finlay & Magnusson's -rivtest- will do for you.

I think I now understand what the AR tests are reporting; the AR stat p-value (0.0607) is interpreted in the same manner as the p-value for liq in the main output (0.084), but with the added orthogonality condition. And given that both p-values are very similar, I can infer with some degree of reliability that the instrument is not weak (that degree of reliability being dependent on the confidence intervals I can generate using Finlay and Magnusson's -rivtest-). Is that correct?

Thanks again for your help

James

> i.e.
>
> Weak-instrument-robust inference
> Tests of joint significance of endogenous regressors B1 in main equation
> Ho: B1=0 and orthogonality conditions are valid
>
> The coefficient significance level of the instrumented variable (liq) is relatively low (p-value = 0.084), but the instrument does not appear to be weak (based on CD and KP stats). However, I would conclude that it potentially is weak based on the AR and SW stats.
> Is my interpretation incorrect, and if so could you indicate how these stats ought to be interpreted?
>
> I greatly appreciate any help you can offer
>
> Best regards
>
> James
>

. xtivreg2 ltdbv lnsale tang itang itangdum tax prof mtb capexsa ndts yr* (liq=tang1 itang1 mtb1 liq1) if lnsalesubs<1 & tangsubs<1, fe robust bw(2) gmm2s first
Warning - singleton groups detected.  91 observation(s) not used.
Warning - collinearities detected
Vars dropped:  yr08

FIXED EFFECTS ESTIMATION
Number of groups =       449                    Obs per group: min =         2
                                                               avg =       6.7
                                                               max =        19

First-stage regressions

First-stage regression of liq:

FIXED EFFECTS ESTIMATION
Number of groups =       449                    Obs per group: min =         2
                                                               avg =       6.7
                                                               max =        19

OLS estimation

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and autocorrelation
  kernel=Bartlett; bandwidth=2
  time variable (t):  year
  group variable (i): firm

                                                      Number of obs =     3021
                                                      F( 31,  2541) =     8.82
                                                      Prob > F      =   0.0000
Total (centered) SS     =  6087.457806                Centered R2   =   0.2732
Total (uncentered) SS   =  6087.457806                Uncentered R2 =   0.2732
Residual SS             =  4424.113333                Root MSE      =     1.32

-------------+----------------------------------------------------------------
             |               Robust
         liq |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnsale |  -.3992946   .1006038    -3.97   0.000    -.5965684   -.2020207
        tang |  -6.503772   1.007147    -6.46   0.000    -8.478685   -4.528859
       itang |  -2.818454   .3907103    -7.21   0.000    -3.584597   -2.052311
    itangdum |    .003545   .1125097     0.03   0.975     -.217075    .2241649
         tax |   .0972279   .1132478     0.86   0.391    -.1248395    .3192952
        prof |   .0405595   .0546733     0.74   0.458    -.0666492    .1477683
         mtb |  -.0525982   .0277353    -1.90   0.058    -.1069843    .0017878
     capexsa |   .8377125   .3265792     2.57   0.010      .197324    1.478101
        ndts |  -.0143917   .0282565    -0.51   0.611    -.0697998    .0410164
        yr90 |   1.155508   3.618686     0.32   0.750    -5.940366    8.251382
        yr91 |  -.2388175   .2513692    -0.95   0.342    -.7317268    .2540919
        yr92 |  -.3008198   .2453313    -1.23   0.220    -.7818894    .1802499
        yr93 |  -.1499197   .2490001    -0.60   0.547    -.6381835     .338344
        yr94 |  -.2144308   .2420701    -0.89   0.376    -.6891055    .2602439
        yr95 |  -.2142347   .2435146    -0.88   0.379     -.691742    .2632725
        yr96 |  -.0750504   .2473898    -0.30   0.762    -.5601566    .4100559
        yr97 |  -.0568015   .2405942    -0.24   0.813    -.5285822    .4149792
        yr98 |  -.2275228   .2263855    -1.01   0.315    -.6714416     .216396
        yr99 |    .065933   .2331514     0.28   0.777    -.3912531    .5231191
        yr00 |   .3334675   .2521301     1.32   0.186    -.1609339    .8278688
        yr01 |  -.0156419   .2300491    -0.07   0.946    -.4667446    .4354608
        yr02 |   .1622597   .2160337     0.75   0.453    -.2613603    .5858797
        yr03 |   .0200205   .2144716     0.09   0.926    -.4005365    .4405775
        yr04 |   .2405879    .219952     1.09   0.274    -.1907155    .6718912
        yr05 |   .1176199   .2308627     0.51   0.610    -.3350784    .5703182
        yr06 |  -.1331952   .2180932    -0.61   0.541    -.5608537    .2944633
        yr07 |   -.370854   .2144122    -1.73   0.084    -.7912944    .0495865
       tang1 |   2.766925   .7109139     3.89   0.000     1.372896    4.160955
      itang1 |   1.893136   .3687716     5.13   0.000     1.170012    2.616259
        mtb1 |   .1395775   .0310299     4.50   0.000      .078731     .200424
        liq1 |   .3000688   .0442671     6.78   0.000     .2132655    .3868721
-------------+----------------------------------------------------------------
Included instruments: lnsale tang itang itangdum tax prof mtb capexsa ndts yr90
                      yr91 yr92 yr93 yr94 yr95 yr96 yr97 yr98 yr99 yr00 yr01
                      yr02 yr03 yr04 yr05 yr06 yr07 tang1 itang1 mtb1 liq1

F test of excluded instruments:
  F(  4,  2541) =    20.20
  Prob > F      =   0.0000
Angrist-Pischke multivariate F test of excluded instruments:
  F(  4,  2541) =    20.20
  Prob > F      =   0.0000

Summary results for first-stage regressions

                                      (Underid)              (Weak id)
Variable     | F(  4,  2541)  P-val | AP Chi-sq(  4) P-val | AP F(  4,  2541)
liq          |      20.20    0.0000 |       81.78   0.0000 |       20.20

NB: first-stage test statistics heteroskedasticity and autocorrelation-robust

Stock-Yogo weak ID test critical values for single endogenous regressor:
                                    5% maximal IV relative bias    16.85
                                   10% maximal IV relative bias    10.27
                                   20% maximal IV relative bias     6.71
                                   30% maximal IV relative bias     5.34
                                   10% maximal IV size             24.58
                                   15% maximal IV size             13.96
                                   20% maximal IV size             10.26
                                   25% maximal IV size              8.31
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.

Underidentification test
Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
Ha: matrix has rank=K1 (identified)
Kleibergen-Paap rk LM statistic          Chi-sq(4)=58.30    P-val=0.0000

Weak identification test
Ho: equation is weakly identified
Cragg-Donald Wald F statistic                                      78.65
Kleibergen-Paap Wald rk F statistic                                20.20

Stock-Yogo weak ID test critical values for K1=1 and L1=4:
                                    5% maximal IV relative bias    16.85
                                   10% maximal IV relative bias    10.27
                                   20% maximal IV relative bias     6.71
                                   30% maximal IV relative bias     5.34
                                   10% maximal IV size             24.58
                                   15% maximal IV size             13.96
                                   20% maximal IV size             10.26
                                   25% maximal IV size              8.31
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.

Weak-instrument-robust inference
Tests of joint significance of endogenous regressors B1 in main equation
Ho: B1=0 and orthogonality conditions are valid
Anderson-Rubin Wald test           F(4,2541)=    2.26     P-val=0.0607
Anderson-Rubin Wald test           Chi-sq(4)=    9.14     P-val=0.0577
Stock-Wright LM S statistic        Chi-sq(4)=    9.22     P-val=0.0557

NB: Underidentification, weak identification and weak-identification-robust
    test statistics heteroskedasticity and autocorrelation-robust

Number of observations               N  =       3021
Number of regressors                 K  =         28
Number of endogenous regressors      K1 =          1
Number of instruments                L  =         31
Number of excluded instruments       L1 =          4

2-Step GMM estimation

Estimates efficient for arbitrary heteroskedasticity and autocorrelation
Statistics robust to heteroskedasticity and autocorrelation
  kernel=Bartlett; bandwidth=2
  time variable (t):  year
  group variable (i): firm

                                                      Number of obs =     3021
                                                      F( 28,  2544) =     3.02
                                                      Prob > F      =   0.0000
Total (centered) SS     =  21.06783592                Centered R2   =   0.0261
Total (uncentered) SS   =  21.06783592                Uncentered R2 =   0.0261
Residual SS             =  20.51803233                Root MSE      =   .08932

-------------+----------------------------------------------------------------
             |               Robust
       ltdbv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         liq |  -.0085538   .0049465    -1.73   0.084    -.0182487    .0011411
      lnsale |   .0053743   .0052578     1.02   0.307    -.0049307    .0156794
        tang |   .1170177   .0610377     1.92   0.055    -.0026139    .2366493
       itang |   .0557467   .0239463     2.33   0.020     .0088127    .1026806
    itangdum |   .0123551   .0065003     1.90   0.057    -.0003853    .0250955
         tax |  -.0193497     .00924    -2.09   0.036    -.0374598   -.0012396
        prof |   .0025405   .0027681     0.92   0.359    -.0028849    .0079659
         mtb |  -.0019451   .0019992    -0.97   0.331    -.0058635    .0019733
     capexsa |   .0108254   .0087886     1.23   0.218       -.0064    .0280507
        ndts |  -.0022495   .0032416    -0.69   0.488     -.008603     .004104
        yr90 |  -.0860865   .1693451    -0.51   0.611    -.4179968    .2458238
        yr91 |  -.0057954   .0156291    -0.37   0.711     -.036428    .0248371
        yr92 |   .0060493   .0148008     0.41   0.683    -.0229596    .0350583
        yr93 |  -.0066494   .0154936    -0.43   0.668    -.0370163    .0237174
        yr94 |  -.0038801   .0137634    -0.28   0.778    -.0308559    .0230956
        yr95 |  -.0021814   .0139629    -0.16   0.876    -.0295482    .0251854
        yr96 |    .007044   .0137418     0.51   0.608    -.0198895    .0339775
        yr97 |   .0119441   .0134385     0.89   0.374    -.0143949    .0382831
        yr98 |   .0069794    .013185     0.53   0.597    -.0188627    .0328216
        yr99 |   .0132963   .0125952     1.06   0.291    -.0113898    .0379825
        yr00 |   .0080221   .0119826     0.67   0.503    -.0154633    .0315074
        yr01 |  -.0000815   .0107388    -0.01   0.994    -.0211291    .0209661
        yr02 |   .0001449   .0106504     0.01   0.989    -.0207295    .0210193
        yr03 |   .0106314   .0115621     0.92   0.358    -.0120299    .0332926
        yr04 |   .0097052   .0102908     0.94   0.346    -.0104643    .0298748
        yr05 |   .0156916   .0108831     1.44   0.149    -.0056388    .0370221
        yr06 |   .0093837   .0108831     0.86   0.389    -.0119467    .0307142
        yr07 |    .005672   .0086985     0.65   0.514    -.0113768    .0227207
-------------+----------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):             58.301
                                                   Chi-sq(4) P-val =    0.0000

Weak identification test (Cragg-Donald Wald F statistic):               78.647
                         (Kleibergen-Paap rk Wald F statistic):         20.198
Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    16.85
                                         10% maximal IV relative bias    10.27
                                         20% maximal IV relative bias     6.71
                                         30% maximal IV relative bias     5.34
                                         10% maximal IV size             24.58
                                         15% maximal IV size             13.96
                                         20% maximal IV size             10.26
                                         25% maximal IV size              8.31
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.

Hansen J statistic (overidentification test of all instruments):         5.596
                                                   Chi-sq(3) P-val =    0.1330

Instrumented:         liq
Included instruments: lnsale tang itang itangdum tax prof mtb capexsa ndts yr90
                      yr91 yr92 yr93 yr94 yr95 yr96 yr97 yr98 yr99 yr00 yr01
                      yr02 yr03 yr04 yr05 yr06 yr07
Excluded instruments: tang1 itang1 mtb1 liq1
Dropped collinear:    yr08

.

>
> ________________________________________
> From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] on behalf of Schaffer, Mark E [M.E.Schaffer@hw.ac.uk]
> Sent: 25 June 2012 12:33
> To: statalist@hsphsun2.harvard.edu
> Subject: st: RE: Interpreting Kleibergen Paap weak instrument statistic
>
> James,
>
> > -----Original Message-----
> > From: owner-statalist@hsphsun2.harvard.edu
> > [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Fitzgerald, James
> > Sent: 21 June 2012 14:02
> > To: statalist@hsphsun2.harvard.edu
> > Subject: st: Interpreting Kleibergen Paap weak instrument statistic
> >
> > Hi Statalist users
> >
> > I am using xtivreg2 to estimate a GMM-IV model (I specify the following options: fe robust bw(2) gmm2s). I am not assuming i.i.d. errors, and thus when testing for weak instruments I am using the Kleibergen-Paap rk Wald F statistic rather than the Cragg-Donald Wald F statistic.
> >
> > xtivreg2 produces Stock-Yogo critical values for the Cragg-Donald statistic assuming i.i.d. errors, so I'm not sure how to interpret the KP rk Wald F stat.
> >
> > The help file for ivreg2 (Baum, Schaffer and Stillman, 2010) does however mention the following:
> >
> > When the i.i.d. assumption is dropped and ivreg2 is invoked with the robust, bw or cluster options, the Cragg-Donald-based weak instruments test is no longer valid.
> > ivreg2 instead reports a correspondingly-robust Kleibergen-Paap Wald rk F statistic. The degrees of freedom adjustment for the rk statistic is (N-L)/L1, as with the Cragg-Donald F statistic, except in the cluster-robust case, when the adjustment is N/(N-1) * (N_clust-1)/N_clust, following the standard Stata small-sample adjustment for cluster-robust. In the case of two-way clustering, N_clust is the minimum of N_clust1 and N_clust2. The critical values reported by ivreg2 for the Kleibergen-Paap statistic are the Stock-Yogo critical values for the Cragg-Donald i.i.d. case.
> > The critical values reported with 2-step GMM are the Stock-Yogo IV critical values, and the critical values reported with CUE are the LIML critical values.
> >
> >
> > My understanding of the end of the paragraph is that the KP stat can still be compared to the Stock-Yogo values produced by Stata in determining whether or not instruments are weak.
> >
> > If someone could confirm or reject this I would be eternally grateful!!
>
> I wrote that paragraph, so the ambiguity is partly my fault. But the problem is that there are no concrete results in the literature for testing for weak IVs when the i.i.d. assumption fails. The only thing one can do (that I'm aware of, anyway) is to point to stats that have an asymptotic justification in a test of underidentification, which is what the output of -ivreg2- does. That is, the K-P stat can be used to test for underidentification without the i.i.d. assumption, and under i.i.d. it has the same distribution under the null as the Cragg-Donald stat.
> This justification is different from that underlying the Stock-Yogo critical values, so this is pretty hand-wavey.
>
> The alternative is weak-instrument-robust estimation, a la Anderson-Rubin, Moreira, Kleibergen, etc. The Finlay-Magnusson -rivtest- command, available via ssc ideas in the usual way, supports this.
> Also see their accompanying SJ paper (vol. 9 no. 3). The command doesn't directly support panel data estimation, which is what you have, but you could just demean your variables by hand.
>
> HTH,
> Mark
>
> > Best wishes
> >
> > James Fitzgerald
> > *
> > *   For searches and help try:
> > *   http://www.stata.com/help.cgi?search
> > *   http://www.stata.com/support/statalist/faq
> > *   http://www.ats.ucla.edu/stat/stata/

--
Heriot-Watt University is the Sunday Times Scottish University of the Year 2011-2012

Heriot-Watt University is a Scottish charity registered under charity number SC000278.
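
As a rough cross-check of the numbers discussed in this thread, the p-values can be recomputed from the printed test statistics with Stata's built-in distribution functions, and the "eyeball" comparison of robust and classical standard errors can be set up with two estimation runs. What follows is only a sketch and not part of the original exchange. The scalar name z_liq and the stored-estimate names hac and classical are made up for illustration. The inputs are the rounded statistics from the output above, so the results should agree with the reported p-values only approximately. The second half assumes the poster's dataset is in memory and already declared as a panel, and it drops gmm2s (a choice made here, not in the thread) so that both runs produce the same 2SLS coefficients and only the standard errors differ.

```stata
* --- Part 1: recompute p-values from the printed statistics --------------
* Wald z-test on liq in the main equation (coef / robust SE)
scalar z_liq = -.0085538/.0049465
display "Wald z for liq      = " %6.2f z_liq
display "two-sided p-value   = " %6.4f 2*normal(-abs(z_liq))

* Anderson-Rubin Wald test, F version: F(4, 2541) = 2.26
display "AR F p-value        = " %6.4f Ftail(4, 2541, 2.26)

* Anderson-Rubin Wald test, chi-squared version: Chi-sq(4) = 9.14
display "AR chi2 p-value     = " %6.4f chi2tail(4, 9.14)

* Stock-Wright LM S statistic: Chi-sq(4) = 9.22
display "SW S p-value        = " %6.4f chi2tail(4, 9.22)

* Hansen J overidentification test: Chi-sq(3) = 5.596
display "Hansen J p-value    = " %6.4f chi2tail(3, 5.596)

* --- Part 2: eyeball comparison of robust vs classical SEs ---------------
* ASSUMPTION: the poster's data are in memory and xtset/tsset (firm, year).
* gmm2s is omitted in both runs so the point estimates coincide.
xtivreg2 ltdbv lnsale tang itang itangdum tax prof mtb capexsa ndts yr* ///
    (liq = tang1 itang1 mtb1 liq1) if lnsalesubs<1 & tangsubs<1,        ///
    fe robust bw(2)
estimates store hac

xtivreg2 ltdbv lnsale tang itang itangdum tax prof mtb capexsa ndts yr* ///
    (liq = tang1 itang1 mtb1 liq1) if lnsalesubs<1 & tangsubs<1,        ///
    fe
estimates store classical

* Side-by-side coefficients and standard errors for a few regressors
estimates table hac classical, b(%9.5f) se(%9.5f) keep(liq lnsale tang itang)
```

If the two standard-error columns look very different, that is the informal signal described earlier in the thread that the i.i.d. assumption is doubtful. It is not a formal test.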

