


Re: st: Clustered standard errors on the region * year level (-xtreg-)

From   "Tobias Pfaff" <[email protected]>
To   <[email protected]>
Subject   Re: st: Clustered standard errors on the region * year level (-xtreg-)
Date   Fri, 16 Sep 2011 16:43:09 +0200

Dear Austin,

I can't collect more regions (Western Germany, our focus, only has 10
states), and the German Federal Statistical Office doesn't provide GDP per
capita on a finer grid for the last 26 years.

My coefficient for GDP per capita is not significant, even without
clustering. So if I cluster on region with too few regions, I can assume
that the standard errors are biased downward. Without the downward
bias the standard errors would be even larger and the coefficient even more
insignificant.
I guess that I would cluster on region in this case and would argue as above
concerning the coefficient of GDP per capita.
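
For reference, the two clustering choices discussed in this thread can be put side by side on simulated data. This is only a rough sketch under stated assumptions: the data are made up, all variable names (id, region, year, gdp_pc, x, y) are hypothetical, and regions are fixed per individual so that panels nest within regions. It is not the actual model or data set.

```stata
* Hypothetical simulated panel; names and sizes are illustrative only
clear
set seed 12345
set obs 200                              // 200 individuals
gen id = _n
gen region = ceil(10 * runiform())       // 10 regions, fixed per individual
expand 26                                // 26 yearly observations each
bysort id: gen year = 1983 + _n          // arbitrary year labels

* Regressor that varies only at the region*year level (stand-in for GDP per capita)
egen region_year = group(region year)
bysort region_year: gen gdp_pc = rnormal() if _n == 1
bysort region_year (gdp_pc): replace gdp_pc = gdp_pc[1]

gen x = rnormal()
gen y = 0.5 * x + 0.1 * gdp_pc + rnormal()
xtset id year

* (a) cluster on region*year: panels are not nested in clusters, hence nonest
xtreg y x gdp_pc i.year, fe vce(cluster region_year) nonest dfadj

* (b) cluster on region: allows arbitrary correlation within region over time
xtreg y x gdp_pc i.year, fe vce(cluster region)
```

Comparing the standard error on gdp_pc across (a) and (b) shows how much the choice of cluster matters in a given sample. With only 10 clusters the result from (b) should still be read with caution, for the reasons given in the quoted reply below.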


-----Original Message-----
> Date: Fri, 16 Sep 2011 10:01:27 -0400
> Subject: Re: st: Clustered standard errors on the region * year level
> From: Austin Nichols <[email protected]>
> To: [email protected]

Tobias Pfaff <[email protected]>:
Yes, you need to cluster on region to allow for arbitrary correlation
within region over time, not region_year, but you have too few regions
to expect the downward bias in the cluster-robust SE to be negligible.
 Collect more regions.  Or "GDP per capita" on a finer grid.  Or try
to model serial correlation within region, rather than adopt a robust
method which requires more clusters.  A sensible strategy is to try a
few plausible models and pick the one that gives the largest SEs,
since we (researchers) invariably underestimate variability of

On Fri, Sep 16, 2011 at 9:53 AM, Tobias Pfaff
<[email protected]> wrote:
> Hi,
> I do a fixed effects regression and wonder how I should cluster the
> errors.
> Dependent variable: individual level
> Independent variables: individual level, GDP per capita on the regional
> level
> No. of regions: 10
> No. of individuals: 30,000
> No. of years: 26
> No. of obs.: 304,000
> Year dummies: yes
> Region dummies: yes
> Since one of my independent variables is aggregated at a higher level than
> the dependent variable I cluster on the region*year level (260 clusters):
> -xtreg depvar indepvars, fe vce(cluster region_year) nonest dfadj-
> ["region_year" was created with -egen region_year = group(region year)-]
> This works fine, but I'm not sure if the combination of region*year as
> the definition of a cluster is OK with the fixed effects model, especially
> since I include region and year dummies as well?
> Clustering only on the regional level would result in 10 clusters, which
> is too few when the number of clusters has to go to infinity for the
> vce(cluster) estimation to work. Right?
> Any help is greatly appreciated!
> Thanks,
> Tobias
