Dear list,

I am using Stata/MP 11.2 for Unix (Linux 64-bit x86-64), born 30 Mar 2011.

I am running IV regressions using xtivreg2 (latest updated version) on a panel of 16 regions and 11 years. I estimate with regional fixed effects. My specification includes, among the RHS variables, year dummies, 6 continuous control variables, and the endogenous regressor of interest. I use 90 IVs to instrument my endogenous regressor, and I cluster standard errors at the regional level.

The problem is the following. When I include only region and year dummies and the endogenous regressor as RHS variables, my first-stage F-statistic for the significance of the excluded instruments goes to infinity. This is what I would expect, given that the degrees of freedom are "negative": the F-statistic is distributed as F(k, d-k), where k is the number of constraints (90 in my case, as I have 90 instruments to test) and d is the number of clusters (16).

When I add the six additional control variables, however, my first-stage F-statistic looks much more normal: 14.73. Do you think this is possible? I don't understand why this happens only when I put controls in the regression.

I have also tried to "partial out" some variables, as suggested in Baum, Schaffer and Stillman's paper ("Enhanced routines for instrumental variables/generalized method of moments estimation and testing", The Stata Journal (2007), 7, Number 4, pp. 465-506) for the case where the number of clusters is less than the number of exogenous regressors plus excluded instruments. Partialling out some exogenous regressors helps the covariance matrix of orthogonality conditions to have full rank. Unfortunately, this has not solved the problem, as I have many instruments. Also, the Kleibergen-Paap Wald rk F statistic (the one suggested by the authors of the above paper in the case of clustered errors) is reported as missing.
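For concreteness, a partialling-out call of the kind described above might look roughly like the following. This is only a sketch, not necessarily the exact command that was run: it assumes the year dummies are the variables partialled out, that xi: names them _Iyear_* (its default prefix), and that the command is run from a do-file so that /// line continuation works. The last line simply compares the cluster count with the number of excluded instruments.

```stata
* Sketch only: partial out the year dummies with the ivreg2 partial() option.
* Assumption: xi: creates the year dummies as _Iyear_* (default prefix);
* which variables to partial out is an illustrative choice.
xi: xtivreg2 netpay public pop age sex shar_ed2 shar_ed3 ///
    (share_reg = GWmean40_UK_* GWmean40_USA_* GWmean40_DE_* ///
                 mean2004_40_UK_* mean2004_40_USA_* mean2004_40_DE_*) ///
    i.year if year>=1997 & year<=2007, ///
    fe cluster(won) first partial(_Iyear_*)

* Counting check: 16 clusters against 90 excluded instruments.
display "clusters minus excluded instruments = " (16 - 90)
```

Partialling out removes the listed exogenous regressors from the count that has to fit within the number of clusters, but it leaves the 90 excluded instruments untouched, which is consistent with the problem persisting here.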
Below I report the command used and the first-stage statistics when I control only for year dummies, using the fixed-effects estimator (xtivreg2 with the fe and cluster() options):

----------------------------------------------------------------------------
xi: xtivreg2 netpay (share_reg = GWmean40_UK_* GWmean40_USA_* ///
    GWmean40_DE_* mean2004_40_UK_* mean2004_40_USA_* mean2004_40_DE_*) ///
    i.year if year>=1997 & year<=2007, fe cluster(won) first
.....
F test of excluded instruments:
  F( 90, 15) =  1.3e+13
  Prob > F   =   0.0000
Angrist-Pischke multivariate F test of excluded instruments:
  F( 90, 15) =  6.7e+12
  Prob > F   =   0.0000

Summary results for first-stage regressions
-------------------------------------------
                                 (Underid)               (Weak id)
Variable   | F( 90, 15)  P-val | AP Chi-sq( 90)  P-val | AP F( 90, 15)
share_reg  |    1.3e+13 0.0000 |        1.5e+15 0.0000 |       6.7e+12

NB: first-stage test statistics cluster-robust
.....
Underidentification test
Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
Ha: matrix has rank=K1 (identified)
Kleibergen-Paap rk LM statistic        Chi-sq(90)=.      P-val= .

Weak identification test
Ho: equation is weakly identified
Cragg-Donald Wald F statistic                 2.99
Kleibergen-Paap Wald rk F statistic              .
.....
Weak-instrument-robust inference
Tests of joint significance of endogenous regressors B1 in main equation
Ho: B1=0 and orthogonality conditions are valid
Anderson-Rubin Wald test       F(0,15)=   .      P-val= .
Anderson-Rubin Wald test       Chi-sq(0)= .      P-val= .
Stock-Wright LM S statistic    Chi-sq(0)= .      P-val= .
....
Number of clusters               N_clust =  16
Number of observations           N       = 176
Number of regressors             K       =  11
Number of endogenous regressors  K1      =   1
Number of instruments            L       = 100
Number of excluded instruments   L1      =   0
----------------------------------------------------------------------------

And this is the command and output with extra controls:

----------------------------------------------------------------------------
xi: xtivreg2 netpay public pop age sex shar_ed2 shar_ed3 ///
    (share_reg = GWmean40_UK_* GWmean40_USA_* GWmean40_DE_* ///
    mean2004_40_UK_* mean2004_40_USA_* mean2004_40_DE_*) ///
    i.year if year>=1997 & year<=2007, fe cluster(won) first
.......
F test of excluded instruments:
  F( 90, 15) =    14.73
  Prob > F   =   0.0000
Angrist-Pischke multivariate F test of excluded instruments:
  F( 90, 15) =    14.73
  Prob > F   =   0.0000

Summary results for first-stage regressions
-------------------------------------------
                                 (Underid)               (Weak id)
Variable   | F( 90, 15)  P-val | AP Chi-sq( 90)  P-val | AP F( 90, 15)
share_reg  |      14.73 0.0000 |        3535.83 0.0000 |         14.73

NB: first-stage test statistics cluster-robust
.......
Underidentification test
Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
Ha: matrix has rank=K1 (identified)
Kleibergen-Paap rk LM statistic        Chi-sq(90)=.      P-val= .

Weak identification test
Ho: equation is weakly identified
Cragg-Donald Wald F statistic                 2.53
Kleibergen-Paap Wald rk F statistic              .
......
Weak-instrument-robust inference
Tests of joint significance of endogenous regressors B1 in main equation
Ho: B1=0 and orthogonality conditions are valid
Anderson-Rubin Wald test       F(6,15)=    16.04   P-val=0.0000
Anderson-Rubin Wald test       Chi-sq(6)= 256.57   P-val=0.0000
Stock-Wright LM S statistic    Chi-sq(6)=      .   P-val=     .
NB: Underidentification, weak identification and weak-identification-robust
    test statistics cluster-robust

Number of clusters               N_clust =  16
Number of observations           N       = 176
Number of regressors             K       =  17
Number of endogenous regressors  K1      =   1
Number of instruments            L       = 106
Number of excluded instruments   L1      =   6
----------------------------------------------------------------------------

Thanks for your consideration.

Best regards,
Anna Rosso

