

From: "Schaffer, Mark E" <M.E.Schaffer@hw.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Problem with IV regression and two-way clustering
Date: Fri, 28 Sep 2012 16:15:13 +0100

Tobias,

The cluster-robust approach is nonparametric in the sense that the VCE is robust to arbitrary within-cluster correlation. That's fine if you've got enough clusters to be reasonably happy that the asymptotics kick in, but I don't think you do.

A parametric approach means that instead of allowing for arbitrary within-cluster correlation, you model and estimate it. In your case, for example, you might estimate the intra-class correlations and then use the "Moulton factor" (a.k.a. the "design effect") to adjust the SEs. Angrist & Pischke's Mostly Harmless Econometrics (2009, chapter 8) has a good discussion. Steve Pischke's website has an ungated extract here:

http://econ.lse.ac.uk/staff/spischke/mhe/ex_ch8.pdf

HTH,
Mark

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Tobias Pfaff
> Sent: Friday, September 28, 2012 3:11 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: RE: st: Problem with IV regression and two-way clustering
>
> Thanks Mark.
> But what do you mean by "parametric approach"?
>
> Regards,
> Tobias
>
> > -----Original Message-----
> > From: owner-statalist@hsphsun2.harvard.edu
> > [mailto:owner-statalist@hsphsun2.harvard.edu] "Schaffer, Mark E" <M.E.Schaffer@hw.ac.uk>
> > Sent: Fri, 28 Sep 2012 12:23:38 +0100
> > To: statalist@hsphsun2.harvard.edu
> > Subject: Re: st: Problem with IV regression and two-way clustering
> >
> > Tobias,
> >
> > My reaction is that 14 clusters is too small. Consistency of the cluster-robust VCE requires the number of clusters to go to infinity, and 14 is just not very far on the way to infinity. You note that with a small number of clusters, the SEs are biased downwards, but the problem isn't just bias - you are going to get noisy estimates of the SEs, i.e., in repeated samples with 14 clusters they can be all over the place.
> >
> > You might instead want to investigate a parametric approach to the problem...?
> >
> > HTH,
> > Mark
> >
> > -----Original Message-----
> > From: owner-statalist@hsphsun2.harvard.edu
> > [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Tobias Pfaff
> > Sent: Thursday, September 27, 2012 9:30 PM
> > To: statalist@hsphsun2.harvard.edu
> > Subject: Re: st: Problem with IV regression and two-way clustering
> >
> > Dear Austin,
> >
> > Yes, some individuals move across regions.
> > If I do the IV regression with two-way clustering, I just find it strange that the tests point to an invalid instrument, given the rather high correlation of the focus variable and the instrument.
> >
> > Regards,
> > Tobias
> >
> > ________________________________________
> > From Austin Nichols <austinnichols@gmail.com>
> > To statalist@hsphsun2.harvard.edu
> > Subject Re: st: Problem with IV regression and two-way clustering
> > Date Thu, 27 Sep 2012 16:03:38 -0400
> > ________________________________________
> >
> > Are individuals moving across regions? If not, the pid clustering is subsumed in region, and you need only cluster at the region level. You might consider 2-d clustering by region and year as well. Clustering by pid is not enough; you have strong correlation of errors and predictors within region across people.
> >
> > On Thu, Sep 27, 2012 at 3:29 PM, Tobias Pfaff <tobias.pfaff@uni-muenster.de> wrote:
> > > Dear Statalisters,
> > >
> > > I would kindly ask you for comments on an instrumental-variables regression with (two-way) clustered standard errors, which is a challenge for me. I'm afraid that the whole problem cannot be written in just a few lines. Below is the whole story (which is hopefully interesting to some of you).
> > >
> > > Any help is greatly appreciated!
> > >
> > > Now the setting:
> > >
> > > Unbalanced individual panel data set, single country
> > > Obs.: 170,000
> > > Individuals: 28,000
> > > Regions: 14
> > > Years: 9
> > > Dependent variable measured on the individual level
> > > Independent variable of interest (focusvar) measured on the regional level
> > > Further control variables: 10, all at the individual level, plus region and year dummies (20 dummies)
> > >
> > > I use individual fixed effects and I cluster on the individual level to control for correlation of the errors over time and get the result that my focus variable is significant:
> > > -xtivreg2 depvar focusvar controlvars, fe cluster(pid)-
> > >
> > > My focus variable is aggregated at a higher level (region) than the dependent variable (individual), and I know from Moulton (1990) that my standard errors can be biased downwards dramatically if I do not cluster at the regional level. Additionally, Donald and Lang (2007) say that without clustering on the regional level, I dramatically overstate the significance of the coefficients. Therefore, I use two-way clustering on the individual and on the regional level:
> > > -xtivreg2 depvar focusvar controlvars, fe cluster(pid region)-
> > >
> > > Now my focus variable is insignificant. However, the number of clusters is small (14), which again leads to biased results (Donald and Lang 2007). Cameron et al. (2008) tell me that "With a small number of clusters the cluster-robust standard errors are downwards biased" (p. 414). Since my focus variable is already insignificant, I would expect the coefficient to be even more insignificant if I corrected for the bias induced by the small number of clusters, and I conclude that I find no evidence for significance.
> > >
> > > Now comes the challenge (as if it has not yet been enough):
> > > I want to do an IV regression to make sure that my results are not influenced by endogeneity bias. I found a variable on the regional level which is theoretically a fine instrument for my regional focus variable. The correlation between the focus variable and the instrument is .60.
> > >
> > > I now estimate the IV model with two-way clustered standard errors:
> > > -xtivreg2 depvar (focusvar = instrumentvar) controlvars, fe cluster(pid region) first-
> > >
> > > The size of the coefficient of my focus variable has decreased. The standard errors have increased drastically, and the coefficient is far from significant. In the first-stage regression, the instrument is not significant. The tests say that the instrument is weak and I cannot reject the null of underidentification. I interpret this as evidence that I have a bad instrument or that my focus variable is not endogenous.
> > >
> > > However, a different picture appears when I only cluster at the individual level:
> > > -xtivreg2 depvar (focusvar = instrumentvar) controlvars, fe cluster(pid) first-
> > >
> > > The standard errors of my focus variable are still much larger than the non-IV estimates, but smaller compared to IV with two-way clustering. The focus variable is again not significant. The instrument is highly significant in the first-stage regression. The tests indicate that the hypotheses of a weak instrument and of underidentification can be rejected. I would interpret this as evidence that my instrument is valid and that my focus variable is endogenous.
> > >
> > > Conclusion:
> > > My interpretation is that the results generally suggest that my focus variable is not significant.
> > >
> > > Open questions:
> > > Is my interpretation wrong?
> > > Is my instrument good or bad - should I trust the results from the one-way or two-way clustering for the IV approach?
> > > In case I want to cluster on the regional level and correct for the bias due to a small number of clusters, I could use wild-bootstrapping as proposed by Cameron et al. (2008), but does that work for IV as well?
> > >
> > > Thanks very much for any clarification,
> > > Tobias
> > >
> > > Cited literature:
> > > Cameron, Gelbach, Miller (2008), Bootstrap-Based Improvements for Inference with Clustered Errors. The Review of Economics and Statistics, 90 (3), 414-427.
> > > Donald, Lang (2007), Inference with Difference-in-Differences and Other Panel Data. The Review of Economics and Statistics, 89 (2), 221-233.
> > > Moulton (1990), An Illustration of a Pitfall in Estimating the Effects of Aggregate Variables on Micro Units. The Review of Economics and Statistics, 72 (2), 334-338.

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/
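As a rough illustration of the "Moulton factor" adjustment mentioned in the reply at the top of this message, here is a minimal Python sketch. It is not from the thread. It assumes the general form usually given for the factor, 1 + [V(n_g)/n_bar + n_bar - 1] * rho_x * rho_e, where n_g are the cluster sizes, rho_e is the intra-class correlation of the residuals and rho_x that of the regressor (taken as 1 for a regressor that is constant within cluster). The function names `anova_icc`, `moulton_factor` and `adjust_se`, the use of the one-way ANOVA estimator for the intra-class correlation, and the population variance for V(n_g) are all choices made for this sketch, and the ICC value in the example is made up.

```python
from collections import defaultdict
from math import sqrt


def anova_icc(values, groups):
    """One-way ANOVA estimate of the intra-class correlation.

    Works for unbalanced groups; `values` would typically be regression
    residuals and `groups` the cluster identifier (e.g. region).
    """
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)

    n_total = len(values)
    n_groups = len(by_group)
    grand_mean = sum(values) / n_total

    ss_between = 0.0
    ss_within = 0.0
    for members in by_group.values():
        group_mean = sum(members) / len(members)
        ss_between += len(members) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in members)

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)
    # "Effective" group size used by the ANOVA estimator with unequal groups.
    sum_sq_sizes = sum(len(m) ** 2 for m in by_group.values())
    n0 = (n_total - sum_sq_sizes / n_total) / (n_groups - 1)
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)


def moulton_factor(cluster_sizes, rho_e, rho_x=1.0):
    """Variance inflation factor 1 + [V(n_g)/n_bar + n_bar - 1] * rho_x * rho_e.

    Assumption of this sketch: V(n_g) is the population variance of the
    cluster sizes, and rho_x defaults to 1 (regressor fixed within cluster).
    """
    n_bar = sum(cluster_sizes) / len(cluster_sizes)
    var_n = sum((n - n_bar) ** 2 for n in cluster_sizes) / len(cluster_sizes)
    return 1.0 + (var_n / n_bar + n_bar - 1.0) * rho_x * rho_e


def adjust_se(conventional_se, factor):
    """Scale a conventional (non-clustered) SE by the square root of the factor."""
    return conventional_se * sqrt(factor)


# Hypothetical example: 14 equal-sized clusters built from the sample size
# quoted in the thread, and a made-up residual ICC of 0.01.
sizes = [170_000 // 14] * 14
factor = moulton_factor(sizes, rho_e=0.01)
print("Moulton factor:", factor)
print("SE multiplier: ", sqrt(factor))
```

With these purely hypothetical inputs the variance factor comes out above 100 and the SE multiplier is on the order of 11, which shows how quickly even a small intra-class correlation matters when clusters are very large. In a real application the cluster sizes and both intra-class correlations would have to be estimated from the data, and the choice of grouping level (region versus region-year) would need to match the level at which the regressor actually varies.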

**References**:
- **RE: st: Problem with IV regression and two-way clustering** *From:* "Tobias Pfaff" <tobias.pfaff@uni-muenster.de>
