How does xtgls differ from regression clustered with robust standard errors?

Title   Comparing xtgls with regress, vce(cluster)
Author Vince Wiggins, StataCorp

Someone asked about the difference between estimation by xtgls under the assumption that the data are panel-level heteroskedastic and estimation by OLS with the vce(cluster) option (see [R] regress).


Does anyone know what the different assumptions are that Stata is making when you ask it to do xtgls with heteroskedastic panels or regress, vce(robust) with clusters?

I am trying to estimate correct errors for a panel dataset that I am assuming is groupwise heteroskedastic. Both of the above seem to be appropriate commands, but they give different errors (and estimates), and I am not sure which one is correct.


Both are fine estimates given the panel-heteroskedastic assumption. If the assumption is correct, the xtgls estimates are more efficient and so would be preferred. If, on the other hand, the within-panel covariance structure is anything other than simple panel-level heteroskedasticity (for example, if the disturbances are correlated within panel), then the xtgls estimates will be inefficient and the reported standard errors will be incorrect.

The regress, vce(cluster) estimates, in contrast, will be correct in either case, though never fully efficient. That is, the regress, vce(cluster) coefficients will be consistent, and the standard errors will provide asymptotically correct coverage rates for hypothesis tests.


xtgls

xtgls will estimate a model by feasible generalized least squares (FGLS) under the assumption that all aspects of the model are completely specified. Here that includes the assumption that the disturbance variance differs from panel to panel but is constant within each panel. Under these assumptions, FGLS is asymptotically efficient and, if iterated (option igls), will produce maximum likelihood estimates of the parameters.

If, however, the assumptions are NOT correct, the standard errors will NOT be correct and usually will be anticonservative.
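
As a minimal sketch of the commands involved, assume the Grunfeld investment data that Stata makes available through webuse, with variables invest, mvalue, and kstock, panels identified by company, and periods by year. The dataset and the choice of regressors are our own illustration and were not part of the original question.

```stata
* Load an example panel dataset (assumed: the Grunfeld data from webuse)
webuse grunfeld, clear
xtset company year

* FGLS assuming a separate disturbance variance for each panel and
* no correlation within or across panels
xtgls invest mvalue kstock, panels(heteroskedastic)

* The same model with iterated FGLS (option igls)
xtgls invest mvalue kstock, panels(heteroskedastic) igls
```

The standard errors reported by both commands rely on the panel-heteroskedastic specification being right.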

regress, vce(cluster)

regress ..., vce(cluster) estimates the model by OLS but uses the linearization/Huber/White/sandwich (robust) estimates of variance (and thus standard errors). These variance estimates are robust in the sense that they provide correct coverage rates under much more than panel-level heteroskedasticity. In particular, they are robust to any type of correlation among the observations within each panel/group.
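
To see how the two sets of results compare in practice, something like the following could be run. It is only a sketch and again assumes the Grunfeld data from webuse (invest, mvalue, kstock, company, year); the stored-estimate names ols_cl and fgls are arbitrary labels chosen for illustration.

```stata
* Example data (assumed: the Grunfeld data from webuse)
webuse grunfeld, clear
xtset company year

* OLS coefficients with variance estimates clustered on the panel identifier
regress invest mvalue kstock, vce(cluster company)
estimates store ols_cl

* FGLS under the panel-heteroskedastic assumption, for comparison
xtgls invest mvalue kstock, panels(heteroskedastic)
estimates store fgls

* Coefficients and standard errors side by side
estimates table ols_cl fgls, se
```

Both the coefficients and the standard errors will generally differ between the two columns, because the estimators weight the data differently and make different assumptions about the disturbances.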

Differing asymptotic properties

Let’s return to the question of datasets. Most data that one would analyze by xtgls have many periods and few panels: say, 3 to 25 panels for the sake of argument. Because we are estimating variance parameters for each panel (or possibly covariances between panels), the estimates require many periods per panel for consistency.

Conversely, the clustered-robust estimator treats each cluster as a superobservation for part of its contribution to the variance estimate (see [P] _robust). In general, we want many clusters/panels when using this method. If we do not have many clusters, the rank of the resulting variance matrix may be smaller than the number of parameters in the model.

Think of a dataset as having n*T observations, where n is the number of panels and T the average number of observations per panel. To use xtgls, T needs to be large. To use regress, vce(cluster), n needs to be large. To consider using both, both need to be large.
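
Before choosing between the two approaches, it can help to check how large n and T actually are. A quick way to do that, sketched here on the Grunfeld data from webuse (an assumed example dataset, not one from the original question), is xtdescribe:

```stata
* Example data (assumed: the Grunfeld data from webuse)
webuse grunfeld, clear
xtset company year

* Report the number of panels (n) and the distribution of
* observations per panel (T)
xtdescribe
```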


We may also want to consider using panel-corrected standard errors (command xtpcse) for estimating variances in such models. These variance estimates return to the assumption of many observations per panel but allow for panel-level heteroskedasticity and contemporaneous correlation of observations between the panels. As with regress, vce(cluster), xtpcse uses OLS parameter estimates, which are consistent but inefficient. xtpcse is in some sense the opposite of regress, vce(cluster): by default, no within-panel correlation is allowed, only correlation among observations at the same period and in different panels. xtpcse does, however, allow one specific form of within-panel correlation; if the correlation() option is specified, the model is estimated by Prais–Winsten FGLS assuming an AR(1) process in the disturbances.

If we type xtpcse ..., we will get OLS parameter estimates along with the panel-corrected variance estimates.
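
The variants described above might be run as follows. This is a sketch only, assuming once more the Grunfeld data from webuse with variables invest, mvalue, and kstock.

```stata
* Example data (assumed: the Grunfeld data from webuse)
webuse grunfeld, clear
xtset company year

* OLS coefficients with panel-corrected standard errors: panel-level
* heteroskedasticity plus contemporaneous correlation across panels
xtpcse invest mvalue kstock

* Panel-level heteroskedasticity only, no correlation across panels
xtpcse invest mvalue kstock, hetonly

* Prais-Winsten estimates assuming a common AR(1) process within panels
xtpcse invest mvalue kstock, correlation(ar1)
```

The hetonly variant is the closest counterpart to the panel-heteroskedastic xtgls model, except that the coefficients remain OLS.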