Stata: Data Analysis and Statistical Software

How does Stata’s implementation of GEE differ from other implementations?

Title   Stata’s implementation of GEE
Author James Hardin, StataCorp
Date January 1997; minor revisions July 2005

Stata’s command for GEE is xtgee. Stata’s implementation of GEE differs from those of other packages in a few ways. Below we describe those differences and, where appropriate, explain how to obtain the same answers that other packages provide.

Use of the scale parameter phi

Stata treats the scale parameter, phi, in the same way as GLM. For continuous distributions (Gaussian and gamma), the default is to set the scale parameter to the generalized chi-squared statistic divided by the degrees of freedom. For discrete distributions (binomial and Poisson), the scale parameter is set to one.
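
As a rough illustration of the default for continuous families, the following Python sketch computes a Pearson (generalized chi-squared) scale estimate: the squared residuals, each divided by the family's variance function, summed and divided by the degrees of freedom. The function name, its arguments, and the choice of n − k as the degrees of freedom are assumptions made for this illustration; they are not taken from Stata's code.

```python
def pearson_scale(y, mu, k, variance=lambda m: 1.0):
    """Pearson chi-squared statistic divided by the residual degrees of freedom.

    y        : observed responses
    mu       : fitted means, same length as y
    k        : number of estimated parameters (assumed df = n - k)
    variance : variance function V(mu); 1 for Gaussian, mu**2 for gamma
    """
    n = len(y)
    if n <= k:
        raise ValueError("need more observations than parameters")
    # Generalized chi-squared: sum of (y - mu)^2 / V(mu)
    chi2 = sum((yi - mi) ** 2 / variance(mi) for yi, mi in zip(y, mu))
    return chi2 / (n - k)


# Hypothetical data, for illustration only
y = [1.0, 2.0, 4.0, 5.0]
mu = [1.5, 2.5, 3.5, 4.5]

# Gaussian family: V(mu) = 1
phi_gaussian = pearson_scale(y, mu, k=2)

# Gamma family: V(mu) = mu**2
phi_gamma = pearson_scale(y, mu, k=2, variance=lambda m: m ** 2)
```

For the discrete families no such estimate is reported by default, because the scale parameter is fixed at one.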

The scale parameter is a by-product of the estimation of the GEE model and is estimated for all possible family-link combinations. For some of these models (such as binomial and Poisson), theory provides no such scale parameter, so it should be one. To ensure that it is, we multiply the resulting variance matrix by the estimated scale parameter and then set the estimate to one.

If you are trying to obtain results that match other packages, keep in mind that they may not offer this feature. If they do not, you can obtain the unscaled results from Stata by specifying the scale(phi) option.

Robust estimates of variance

This discussion applies only to the robust estimation results.

Stata scales the robust estimates of variance in the following way (which may differ from other implementations). In addition to the treatment of the scale parameter described above, Stata multiplies the robust variance matrix by a finite-sample factor. In the formulas below, g is the number of groups (clusters), n is the number of observations, and k is the number of parameters in the model.

For the Gaussian family (linear regression) models, the robust variance matrix is multiplied by

          g         n − 1
        -----   ×   -----
        g − 1       n − k

For all other families, the robust variance matrix is multiplied by

          g
        ----- 
        g − 1  

This rescaling of the robust variance matrix is standard throughout Stata for clustered robust estimation results. Many other implementations apply only the g/(g − 1) factor, which is the scaling Stata uses for the non-Gaussian families. The extra (n − 1)/(n − k) term for the Gaussian family is intended to give better coverage probabilities (closer to their nominal levels).
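
To make the two factors concrete, here is a minimal Python sketch that evaluates them for given g, n, and k. The function name and the `gaussian` flag are hypothetical; the sketch reproduces only the arithmetic described above and is not Stata's code.

```python
def robust_scale_factor(g, n, k, gaussian):
    """Multiplier applied to the clustered robust variance matrix.

    g        : number of groups (clusters)
    n        : number of observations
    k        : number of parameters in the model
    gaussian : True for the Gaussian family, False for all other families
    """
    if g < 2:
        raise ValueError("need at least two groups")
    factor = g / (g - 1)
    if gaussian:
        if n <= k:
            raise ValueError("need more observations than parameters")
        # Extra finite-sample term used only for the Gaussian family
        factor *= (n - 1) / (n - k)
    return factor


# Hypothetical design: 50 clusters, 500 observations, 4 parameters
f_gaussian = robust_scale_factor(50, 500, 4, gaussian=True)
f_other = robust_scale_factor(50, 500, 4, gaussian=False)
```

Both factors approach one as g and n grow, so the difference between implementations matters most when there are few clusters.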

If you are trying to match results obtained from another package, you may need to retrieve the variance–covariance matrix from Stata and undo the scaling that Stata applied before the results agree.
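
One way to undo the scaling is sketched below in Python, assuming the robust variance–covariance matrix has already been exported from Stata as a list of rows. The function name, the `gaussian` flag, and the example matrix are hypothetical, and whether the other package applies no factor at all or only g/(g − 1) has to be checked against that package's documentation.

```python
def undo_stata_scaling(V, g, n, k, gaussian):
    """Divide a robust variance matrix by the factor Stata multiplied in.

    V        : variance-covariance matrix as a list of rows
    g, n, k  : groups, observations, parameters
    gaussian : True if the model used the Gaussian family
    """
    factor = g / (g - 1)
    if gaussian:
        factor *= (n - 1) / (n - k)
    # Element-wise division removes the finite-sample multiplier
    return [[v / factor for v in row] for row in V]


# Hypothetical 2 x 2 matrix standing in for one exported from Stata
V_stata = [[0.04, 0.01],
           [0.01, 0.09]]
V_unscaled = undo_stata_scaling(V_stata, g=10, n=100, k=2, gaussian=True)

# Standard errors are the square roots of the diagonal
se_unscaled = [V_unscaled[i][i] ** 0.5 for i in range(len(V_unscaled))]
```

To match a package that applies only g/(g − 1), multiply the unscaled matrix by that factor again.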
