Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

Antw: Re: st: Clustered standard errors: Insufficient observations

From   "Jost Heckemeyer" <>
To   <>
Subject   Antw: Re: st: Clustered standard errors: Insufficient observations
Date   Sat, 14 May 2011 20:45:33 +0200

The "generally recommended" referred to the example I gave, i.e. I have
cross-country data where hundreds or thousands of firms are clustered
within several dozen countries, so clustering at the country level
would indeed make sense, as you write, Stas. Still, it does not work,
and I cannot see any reason why it doesn't.

>>> Stas Kolenikov, 14.05.11, 14:22 >>>
On Sat, May 14, 2011 at 12:30 PM, Jost Heckemeyer  wrote:
> Dear Statalisters,
> I want to estimate a large cross-country panel model (> 50,000 firm
> observations). As some of my main explanatory variables vary mainly at
> the country level (e.g. tax rates), I cluster standard errors within
> countries, not within firms, as it is generally recommended to do.
> However, as soon as I estimate a firm fixed effects model (xtreg, fe
> with option cluster(country); the fixed effects are at the firm level),
> it does not work anymore and just gives me the error "insufficient
> observations". xtreg, re and all pooled estimators all work well with
> clustering within countries.
> xtreg, fe also works with clustering within firms. But this is not
> what I want. It would be great if anyone could help me with this problem.
> What can I do?
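
A minimal Stata sketch of the commands the question seems to describe. The variable names (firm_id, year, country, profit, taxrate) are hypothetical and do not appear in the original post, and the comments only restate what the poster reports. It is also an assumption that the random-effects and pooled runs used country-level clustering.

```stata
* Hypothetical variable names; the original post does not list them.
xtset firm_id year

* Reported to work: random effects and pooled OLS
* (assumed here to be run with country-level clusters)
xtreg profit taxrate, re vce(cluster country)
regress profit taxrate, vce(cluster country)

* Reported to work: firm fixed effects with firm-level clusters
xtreg profit taxrate, fe vce(cluster firm_id)

* Reported to fail with "insufficient observations":
* firm fixed effects with country-level clusters
xtreg profit taxrate, fe vce(cluster country)
```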

Clustered standard errors are intended to work at the highest level of
your sampling, or the highest level at which you expect correlations
in the error terms (because of unobserved or omitted variables, say).
I'd be curious to see who "generally recommends" clustering at
the level at which the explanatory variables vary; if this were a good
claim, you would have to cluster on gender in any labor market
regression, leaving you with 2 d.f.s to estimate at most 1 parameter
besides the intercept. If you sampled 3 countries from a list of
developing countries, and then 10K firms within country, then you
would want to cluster at the level of the countries (although it won't
produce reasonable results, since you need at least several dozen
clusters to get sensible performance). If you had three countries
because that's where your collaborators have been, the country
dimension has nothing to do with sampling, and the legitimate sampling
units would be firms (unless of course you sampled industry codes
first, in which case you would need to cluster by the industry codes).
If you are concerned about lack of control over your country
dimension, you could specify interactions of your explanatory
variables (maybe a subset of them) with the country variable, using
something like i.ownership.
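
The interaction suggestion could be sketched roughly as follows in Stata's factor-variable notation. This is only an illustration under assumptions: apart from ownership, which the reply mentions, the variable names (firm_id, year, country, profit, taxrate) are hypothetical, and ownership is assumed to be a categorical variable that changes over time within at least some firms.

```stata
* Hypothetical variable names; only "ownership" is named in the thread,
* and its coding as a categorical variable is an assumption.
xtset firm_id year

* Firm fixed effects with firm-level clusters. The effect of ownership
* is allowed to differ by country through the interaction term.
* Country main effects are absorbed by the firm fixed effects if firms
* never change country, so they are not listed separately.
xtreg profit taxrate i.ownership i.ownership#i.country, ///
    fe vce(cluster firm_id)
```

Stata drops any interaction cells that are collinear with the fixed effects or with other terms, so the output should be checked for omitted levels.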

Stas Kolenikov, also found at
Small print: I use this email account for mailing lists only.
*   For searches and help try:

Zentrum für Europäische Wirtschaftsforschung GmbH (ZEW)
Centre for European Economic Research, L7,1, 68161 Mannheim, Germany
Seat of the Company: Mannheim; Local Court Mannheim, HRB 6554
Chairman of the Supervisory Board: Gerhard Stratthaus MdL, former Minister of Finance
Executive Directors: Prof. Dr. Dr. h.c. mult. Wolfgang Franz, Thomas Kohl
