
Which references should I cite when using the vce(cluster clustvar) option to obtain Stata’s cluster-correlated robust estimate of variance?

Title   Citing references for Stata’s cluster-correlated robust variance estimates
Authors Roberto Gutierrez, StataCorp
        David M. Drukker, StataCorp


In performing my statistical analysis, I have used Stata’s _____ estimation command with the vce(cluster clustvar) option to obtain a robust variance estimate that adjusts for within-cluster correlation. A journal referee now asks that I give the appropriate reference for this calculation. Which references should I cite?

Short answer

Rogers, W. H. 1993.
Regression standard errors in clustered samples. Stata Technical Bulletin 13: 19–23. Reprinted in Stata Technical Bulletin Reprints, vol. 3, 88–94.
Williams, R. L. 2000.
A note on robust variance estimation for cluster-correlated data. Biometrics 56: 645–646.
Wooldridge, J. M. 2002.
Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT Press.
Froot, K. A. 1989.
Consistent covariance matrix estimation with cross-sectional dependence and heteroskedasticity in financial data. Journal of Financial and Quantitative Analysis 24: 333–355.

Long answer

Most of Stata’s estimation commands provide the vce(robust) option. By specifying vce(robust), one may forgo model-based variance estimates in favor of the more model-agnostic “robust” variances. Robust variances give accurate assessments of the sample-to-sample variability of the parameter estimates even when the model is misspecified. The robust variance comes under various names and within Stata is known as the Huber/White/sandwich estimate of variance. The names Huber and White refer to the seminal references for this estimator:

Huber, P. J. 1967.
The behavior of maximum likelihood estimates under nonstandard conditions. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability. Berkeley, CA: University of California Press, vol. 1, 221–233.

White, H. 1980.
A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48: 817–830.

The name “sandwich” refers to the mathematical form of the estimate, which is calculated as the product of three matrices. The middle matrix (the meat of the sandwich) is formed by summing the outer products of the observation-level likelihood/pseudolikelihood score vectors. This matrix is in turn pre- and postmultiplied by the usual model-based variance matrix (the bread of the sandwich).
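Written out, the three-matrix product described above looks roughly as follows. The symbols are chosen here for illustration and are not taken from the cited papers: D is the usual model-based variance matrix, u_j is the score (as a row vector) for observation j, and N is the number of observations. Any finite-sample multiplier that Stata applies is omitted from this sketch.

```latex
\widehat{V}_{\text{robust}}
  = \widehat{D}
    \left( \sum_{j=1}^{N} u_j^{\top} u_j \right)
    \widehat{D}
```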

Huber (1967) and White (1980), however, do not deal with clustering. When you have clustering, the observations within a cluster cannot be treated as independent, but the clusters themselves are independent. Here, the robust calculation is straightforwardly generalized by replacing the meat of the sandwich with a matrix formed by taking the outer product of the cluster-level scores, where within each cluster the cluster-level score is obtained by summing the observation-level scores. See Rogers (1993) and [P] _robust for details.
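The calculation just described can be sketched for the simplest case, a least-squares fit of y on a constant and one regressor. This is a minimal illustration in Python, not Stata's implementation: the function names (ols_2param, matmul, sandwich) and the data are hypothetical, and the finite-sample adjustments that Stata applies are left out. For least squares, the observation-level score is taken to be the regressor vector times the residual, and the bread is the inverse of X'X.

```python
from collections import defaultdict

def ols_2param(x, y):
    """Fit y = b0 + b1*x by least squares; return coefficients, bread, residuals."""
    n = len(x)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx
    # bread = (X'X)^{-1} for a design matrix with rows (1, x_i)
    bread = [[sxx / det, -sx / det], [-sx / det, n / det]]
    b0 = bread[0][0] * sy + bread[0][1] * sxy
    b1 = bread[1][0] * sy + bread[1][1] * sxy
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    return (b0, b1), bread, resid

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def sandwich(x, resid, bread, groups):
    """Return bread * meat * bread, with the meat built from group-level scores."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for xi, ei, g in zip(x, resid, groups):
        totals[g][0] += ei        # score component for the constant
        totals[g][1] += xi * ei   # score component for the slope
    meat = [[0.0, 0.0], [0.0, 0.0]]
    for u in totals.values():     # sum of outer products of the summed scores
        for r in range(2):
            for c in range(2):
                meat[r][c] += u[r] * u[c]
    return matmul(matmul(bread, meat), bread)

# Made-up data: six observations in three clusters of two.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.4, 3.9, 5.3, 5.8]
cluster = ["a", "a", "b", "b", "c", "c"]

coef, bread, resid = ols_2param(x, y)
v_robust = sandwich(x, resid, bread, range(len(x)))  # each observation alone
v_cluster = sandwich(x, resid, bread, cluster)       # scores summed by cluster

print("slope:", coef[1])
print("unclustered robust SE of slope:", v_robust[1][1] ** 0.5)
print("cluster-robust SE of slope:", v_cluster[1][1] ** 0.5)
```

Passing a distinct group label for every observation reproduces the unclustered robust calculation, which is why one function serves both cases here: the only difference between the two estimators is the level at which the scores are summed before the outer products are taken.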

This generalization for clustering is, in fact, so “straightforward” that it long remained undocumented in the literature, until Froot (1989). Indeed, Williams (2000) is simply a short note that comments on this fact and gives a short proof of the validity of the estimator:

This brief note presents a general proof that the [modified-sandwich] estimator is unbiased for cluster-correlated data regardless of the setting. The result is not new, but a simple and general reference is not readily available.

The above hints that Froot (1989) may be little known outside the econometrics community and Rogers (1993) is little known among non-Stata users. Those requiring a reference from a refereed journal can therefore cite Froot (1989) as the seminal reference or Williams (2000) for its direct statement of this result. Those wanting a reference for how the calculation is actually performed in Stata can use Rogers (1993). Also, those wanting a textbook proof can cite Wooldridge (2002, sec. 13.8.2).

Finally, although White did not explicitly consider cluster sampling, he did address the finitely correlated case in his 1984 and 1994 books. The results for clustered data can also be derived from the results in section 8.3 of White (1994).

More references

White, H. 1984.
Asymptotic Theory for Econometricians. Orlando, FL: Academic Press.
White, H. 1994.
Estimation, Inference and Specification Analysis. New York: Cambridge University Press.