Stata | FAQ: Comparing xtgls with regress, vce(cluster)

How does xtgls differ from regression clustered with robust standard errors?

Title		Comparing xtgls with regress, vce(cluster)
Author		Vince Wiggins, StataCorp

Someone asked about the difference between estimation by xtgls under the assumption that the data are panel-level heteroskedastic and estimation by OLS with the vce(cluster) option (see [R] regress).

Question

Does anyone know what the different assumptions are that Stata is making when you ask it to do xtgls with heteroskedastic panels or regress, vce(robust) with clusters?

I am trying to estimate correct errors for a panel dataset that I am assuming is groupwise heteroskedastic. Both of the above seem to be appropriate commands, but they give different errors (and estimates), and I am not sure which one is correct.

Answer

Both are fine estimates given the panel-heteroskedastic assumption. If the assumption is correct, the xtgls estimates are more efficient and so would be preferred. If the covariances within panel are different from simply being panel heteroskedastic, on the other hand, then the xtgls estimates will be inefficient and the reported standard errors will be incorrect.

The regress, vce(cluster) estimates, by contrast, remain valid in either case, although they are never fully efficient. That is, the regress, vce(cluster) coefficients will be consistent, and the standard errors will provide correct coverage rates for hypothesis tests.

xtgls

xtgls will estimate a model by feasible generalized least squares (FGLS) under the assumption that all aspects of the model are completely specified. Here that includes that the disturbances have different variances for each panel and are constant within panel. Under these assumptions, FGLS is asymptotically efficient and if iterated (option igls) will produce maximum likelihood estimates of the parameters.

If, however, the assumptions are NOT correct, the standard errors will NOT be correct and usually will be anticonservative.
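As a minimal sketch of the estimator described above, the commands below fit a panel-heteroskedastic FGLS model and then its iterated version. The Grunfeld investment data (webuse grunfeld) and the variables invest, mvalue, and kstock are chosen here only for illustration; they are not part of the original question.

```stata
* Load an example panel (10 companies observed over 20 years)
webuse grunfeld, clear
xtset company year

* FGLS with a separate disturbance variance for each panel
xtgls invest mvalue kstock, panels(heteroskedastic)

* Same model, iterating the FGLS steps to convergence
xtgls invest mvalue kstock, panels(heteroskedastic) igls
```

The reported standard errors from either command rely on the panel-heteroskedastic specification being right.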

regress, vce(cluster)

regress ..., vce(cluster) estimates the model by OLS but uses the linearization/Huber/White/sandwich (robust) estimates of variance (and thus standard errors). These variance estimates are robust in the sense that they provide correct coverage rates under much more than panel-level heteroskedasticity. In particular, they are robust to any type of correlation within the observations of each panel/group.
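A short sketch of the clustered alternative, again assuming the Grunfeld example data purely for illustration, with company playing the role of the panel/cluster identifier:

```stata
* Load an example panel (10 companies observed over 20 years)
webuse grunfeld, clear

* OLS coefficients with cluster-robust (sandwich) standard errors,
* clustering on the panel identifier
regress invest mvalue kstock, vce(cluster company)
```

The coefficients are the ordinary OLS estimates; only the variance estimate changes relative to plain regress.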

Differing asymptotic properties

Let’s return to the question of datasets. Most data that one would analyze by xtgls have many periods and few panels (say, 3 to 25 panels, for the sake of argument). Because we are estimating variance parameters for each panel (or possibly covariances between panels), the estimates require many periods per panel for consistency.

Conversely, the clustered-robust estimator treats each cluster as a superobservation for part of its contribution to the variance estimate (see [P] _robust). In general, we want many clusters/panels when using this method. If we do not have many clusters, the rank of the resulting variance matrix may be smaller than the number of parameters in the model.

Think of a dataset as having n*T observations, where n is the number of panels and T the average number of observations per panel. To use xtgls, T needs to be large. To use regress, vce(cluster), n needs to be large. To consider using both, both need to be large.
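Before choosing between the two approaches, it may help to look at n and T directly. The sketch below uses xtdescribe on the Grunfeld example data, which is an assumption made for illustration only:

```stata
* Load an example panel and declare its panel and time variables
webuse grunfeld, clear
xtset company year

* Report the number of panels (n) and the distribution of
* observations per panel (T)
xtdescribe
```

With these data, n is only 10 while every panel has 20 periods, so the cluster-robust variance estimate would rest on very few clusters.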

xtpcse

We may also want to consider using panel-corrected standard errors (see [XT] xtpcse) for estimating variances in such models. These variance estimates return to the assumption of many observations per panel but allow for panel-level heteroskedasticity and contemporaneous correlation of observations between the panels. As with regress, vce(cluster), xtpcse uses OLS parameter estimates, which are consistent but inefficient. xtpcse is in some sense the opposite of regress, vce(cluster): no within-panel correlation is allowed, only correlation among observations at the same period and in different panels. xtpcse will, however, allow one specific form of within-panel correlation; if the correlation() option is specified, the model is estimated by Prais–Winsten FGLS assuming an AR(1) process in the disturbances.

If we type xtpcse ..., we will get OLS parameter estimates along with the panel-corrected variance estimates.
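For illustration, both uses of xtpcse can be sketched as follows, once more assuming the Grunfeld example data and variable names, which are not part of the original question:

```stata
* Load an example panel and declare its panel and time variables
webuse grunfeld, clear
xtset company year

* OLS coefficients with panel-corrected standard errors
xtpcse invest mvalue kstock

* Prais-Winsten estimates with a common AR(1) process in the disturbances
xtpcse invest mvalue kstock, correlation(ar1)
```

correlation(ar1) imposes one autocorrelation parameter shared by all panels.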
