

From: "Schaffer, Mark E" <M.E.Schaffer@hw.ac.uk>

To: <statalist@hsphsun2.harvard.edu>

Subject: st: RE: xtivreg2, clustered errors and F statistic

Date: Wed, 12 Oct 2011 00:37:29 +0100

Anna,

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Anna Rosso
> Sent: 11 October 2011 23:09
> To: statalist@hsphsun2.harvard.edu
> Subject: st: xtivreg2, clustered errors and F statistic
>
> Dear list,
>
> I am using Stata/MP 11.2 for Unix (Linux 64-bit x86-64) Born
> 30 Mar 2011
>
> I am running IV regressions using xtivreg2 (latest updated
> version) on a panel of 16 regions and 11 years.
>
> I estimate using regional fixed effects. My specification
> includes among the RHS variables year dummies, 6 continuous
> control variables, and the endogenous regressor of interest.
> I use 90 IVs to instrument my endogenous regressor, and I
> cluster standard errors at the regional level.

Here are the counts from your estimation without controls:

Number of clusters             N_clust = 16
Number of observations         N       = 176
Number of regressors           K       = 11
Number of endogenous regressors K1     = 1
Number of instruments          L       = 100
Number of excluded instruments L1      = 0

Things are going very badly wrong, and you may in fact have found a bug in xtivreg2 or ivreg2 - you somehow have 1 endogenous regressor but no excluded instruments! Please send me privately your full output of this specification and I will try to trace what happened.

But you are unlikely to get sensible results with a setup like this in any case. You have 176 observations, 100 instruments and 1 endogenous variable, so the degree of overidentification is very high compared to the sample size. The finite-sample bias of the IV estimator is increasing in the degree of overidentification, so your estimated coefficient is likely to be badly biased.

And that's not even considering the problem that you have only 16 clusters. For the cluster-robust VCV to be consistent, the number of clusters has to go off to infinity, and 16 isn't very far on the way to infinity.
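The point about bias growing with the degree of overidentification can be illustrated with a small simulation. The sketch below is not from the thread and does not use the poster's data: the data-generating process, the first-stage coefficient of 0.5, the error correlation of 0.8, and the function names `tsls_slope` and `median_bias` are all assumptions chosen for illustration. Only the first instrument is relevant; the rest are pure noise. The sample size of 176 matches the post.

```python
import numpy as np


def tsls_slope(y, x, Z):
    """2SLS slope for a single endogenous regressor (no intercept)."""
    # First stage: least-squares projection of x on the instruments.
    coef, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ coef
    # Second stage: (x' P_Z y) / (x' P_Z x).
    return float(x_hat @ y) / float(x_hat @ x)


def median_bias(n_instruments, n_obs=176, beta=1.0, reps=300, seed=0):
    """Median of (2SLS estimate - true beta) over simulated samples."""
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(reps):
        Z = rng.standard_normal((n_obs, n_instruments))
        v = rng.standard_normal(n_obs)                   # first-stage error
        u = 0.8 * v + 0.6 * rng.standard_normal(n_obs)   # corr(u, v) = 0.8
        x = 0.5 * Z[:, 0] + v       # only the first instrument is relevant
        y = beta * x + u
        estimates.append(tsls_slope(y, x, Z))
    return float(np.median(estimates)) - beta


bias_few = median_bias(5)      # mildly overidentified
bias_many = median_bias(100)   # heavily overidentified, as in the post
print(f"median bias with   5 instruments: {bias_few:.3f}")
print(f"median bias with 100 instruments: {bias_many:.3f}")
```

Under these assumptions the estimate with 100 instruments is pulled much further toward the OLS estimate than the one with 5, because the first stage overfits the endogenous regressor.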
Also, as you note, you cannot test for weak or underidentification, because the rank of the first-stage VCV will be only 16, so you can't test the joint significance of the 100 instruments in the first-stage regression.

--Mark

>
> The problem is the following:
> When I only include as RHS variables region and year dummies
> and the endogenous regressor, my first stage F-stat for the
> significance of excluded instruments goes to infinite. This
> is what I would expect given the degrees of freedom are
> "negative": F stat is distributed as F(k,d-k) Where k is the
> number of constraints (90 in my case, as I have 90
> instruments to test), d is the number of clusters(16)
>
> When I add the additional six control variables: my first
> stage F-statistic is more than normal: 14.76.
> Do you think it is possible? I don't understand why this is
> happening only when I put controls in the regression.
>
> I have also tried to "partial out" some variables, as
> suggested in Baum, Schaffer and Stillman's paper("Enhanced
> routines for instrumental variables/generalized method of
> moments" The Stata Journal (2007), 7, Number 4, pp. 465-506)
> when the number of clusters is less than the number of
> exogenous regressors + excluded instruments. Partialling out
> some exogenous regressors helps the covariance matrix of
> orthogonality conditions to have full rank. Unfortunately,
> this still has not solved the problem as I have many instruments.
> Also, the Kleibergen-Paap Wald rk F statistic (which is the
> one suggested by the authors of the above paper in case of
> clustered errors) is reported as missing.
>
> I report the command used and the output of first stage
> statistics when I only control for year dummies using fixed
> effect estimator (xtivreg2 with fe and cluster() options)
> ----------------------------------------------------------------------------
> xi: xtivreg2 netpay (share_reg = GWmean40_UK_*
> GWmean40_USA_* GWmean40_DE_*
> mean2004_40_UK_* mean2004_40_USA_* mean2004_40_DE_*) i.year if
> year>=1997&year<=2007 ,fe cluster(won) first
>
> .....
>
> F test of excluded instruments:
>   F( 90, 15) = 1.3e+13
>   Prob > F   = 0.0000
> Angrist-Pischke multivariate F test of excluded instruments:
>   F( 90, 15) = 6.7e+12
>   Prob > F   = 0.0000
>
> Summary results for first-stage regressions
> -------------------------------------------
>
>                                (Underid)              (Weak id)
> Variable  | F( 90, 15)  P-val | AP Chi-sq( 90) P-val | AP F( 90, 15)
> share_reg |    1.3e+13 0.0000 |        1.5e+15 0.0000 |       6.7e+12
>
> NB: first-stage test statistics cluster-robust
>
> .....
>
> Underidentification test
> Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
> Ha: matrix has rank=K1 (identified)
> Kleibergen-Paap rk LM statistic      Chi-sq(90)=.   P-val= .
>
> Weak identification test
> Ho: equation is weakly identified
> Cragg-Donald Wald F statistic              2.99
> Kleibergen-Paap Wald rk F statistic           .
> .....
>
> Weak-instrument-robust inference
> Tests of joint significance of endogenous regressors B1 in main equation
> Ho: B1=0 and orthogonality conditions are valid
> Anderson-Rubin Wald test      F(0,15)=   .   P-val= .
> Anderson-Rubin Wald test      Chi-sq(0)= .   P-val= .
> Stock-Wright LM S statistic   Chi-sq(0)= .   P-val= .
>
> ....
>
> Number of clusters             N_clust = 16
> Number of observations         N       = 176
> Number of regressors           K       = 11
> Number of endogenous regressors K1     = 1
> Number of instruments          L       = 100
> Number of excluded instruments L1      = 0
> ----------------------------------------------------------------------------
> And this is the command and output with extra controls:
>
> ----------------------------------------------------------------------------
> xi: xtivreg2 netpay public pop age sex shar_ed2 shar_ed3 (share_reg =
> GWmean40_UK_* GWmean40_USA_* GWmean40_DE_* mean2004_40_UK_*
> mean2004_40_USA_* mean2004_40_DE_*) i.year if year>=1997&year<=2007,fe
> cluster(won) first
>
> .......
>
> F test of excluded instruments:
>   F( 90, 15) = 14.73
>   Prob > F   = 0.0000
> Angrist-Pischke multivariate F test of excluded instruments:
>   F( 90, 15) = 14.73
>   Prob > F   = 0.0000
>
> Summary results for first-stage regressions
> -------------------------------------------
>
>                                (Underid)              (Weak id)
> Variable  | F( 90, 15)  P-val | AP Chi-sq( 90) P-val | AP F( 90, 15)
> share_reg |      14.73 0.0000 |        3535.83 0.0000 |         14.73
>
> NB: first-stage test statistics cluster-robust
>
> .......
>
> Underidentification test
> Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
> Ha: matrix has rank=K1 (identified)
> Kleibergen-Paap rk LM statistic      Chi-sq(90)=.   P-val= .
>
> Weak identification test
> Ho: equation is weakly identified
> Cragg-Donald Wald F statistic              2.53
> Kleibergen-Paap Wald rk F statistic           .
>
> ......
>
> Weak-instrument-robust inference
> Tests of joint significance of endogenous regressors B1 in main equation
> Ho: B1=0 and orthogonality conditions are valid
> Anderson-Rubin Wald test      F(6,15)=   16.04    P-val=0.0000
> Anderson-Rubin Wald test      Chi-sq(6)= 256.57   P-val=0.0000
> Stock-Wright LM S statistic   Chi-sq(6)= .        P-val= .
>
> NB: Underidentification, weak identification and weak-identification-robust
> test statistics cluster-robust
>
> Number of clusters             N_clust = 16
> Number of observations         N       = 176
> Number of regressors           K       = 17
> Number of endogenous regressors K1     = 1
> Number of instruments          L       = 106
> Number of excluded instruments L1      = 6
> ----------------------------------------------------------------------------
>
>
> Thanks for your consideration.
>
> Best regards,
>
> Anna Rosso
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>

--
Heriot-Watt University is a Scottish charity registered under charity number SC000278.

Heriot-Watt University is the Sunday Times Scottish University of the Year 2011-2012
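The rank argument in the reply (a cluster-robust VCV built from 16 clusters cannot support a joint test on 90 instruments) can be checked numerically. The sketch below is an illustration on simulated data, not a reproduction of the xtivreg2 computation: the instruments `Z` and residuals `e` are random draws, and the variable names are assumptions. It builds the "meat" of a cluster-robust covariance estimator as a sum of outer products of per-cluster score vectors, so its rank is bounded by the number of clusters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions taken from the post: 16 regions x 11 years, 90 excluded instruments.
n_clusters, n_periods, n_instruments = 16, 11, 90
n_obs = n_clusters * n_periods

cluster = np.repeat(np.arange(n_clusters), n_periods)  # cluster id for each row
Z = rng.standard_normal((n_obs, n_instruments))        # simulated instruments (assumption)
e = rng.standard_normal(n_obs)                         # simulated residuals (assumption)

# One score vector per cluster: s_g = Z_g' e_g, of length n_instruments.
scores = np.zeros((n_clusters, n_instruments))
for g in range(n_clusters):
    rows = cluster == g
    scores[g] = Z[rows].T @ e[rows]

# Cluster-robust "meat": the sum over clusters of s_g s_g', a 90 x 90 matrix.
meat = scores.T @ scores
rank = int(np.linalg.matrix_rank(meat))

print(f"meat is {meat.shape[0]} x {meat.shape[1]}, rank = {rank}")
```

Because the 90 x 90 matrix is a sum of only 16 rank-one terms, it is singular, which is consistent with the Kleibergen-Paap statistics being reported as missing in both runs.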

