How does Stata’s implementation of GEE differ from other implementations?
Title: Stata’s implementation of GEE
Author: James Hardin, StataCorp
Date: January 1997; minor revisions July 2005
Stata’s command for GEE is xtgee. Stata’s implementation of GEE differs
from other packages in a few ways. Below we describe those differences and,
where appropriate, explain how to get the same answers as those provided by
other packages.
Use of the scale parameter phi
Stata treats the scale parameter, phi, in the same way as GLM. For
continuous distributions (Gaussian and gamma), the default is to set the
scale parameter to the generalized chi-squared statistic divided by the
degrees of freedom. For discrete distributions (binomial and Poisson), the
scale parameter is set to one.
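To make the default for continuous families concrete, the following Python sketch computes a Pearson-type (generalized) chi-squared statistic divided by the residual degrees of freedom. It is illustrative arithmetic only, not Stata's code: the function name `pearson_scale`, the toy data, and the use of n − k as the degrees of freedom are assumptions made for this example.

```python
def pearson_scale(y, mu, variance_fn, n_params):
    """Generalized chi-squared divided by residual degrees of freedom.

    y           : observed responses
    mu          : fitted means
    variance_fn : family variance function V(mu) (assumed form, see below)
    n_params    : number of estimated parameters k
    """
    if len(y) != len(mu):
        raise ValueError("y and mu must have the same length")
    df = len(y) - n_params  # assumed residual degrees of freedom, n - k
    if df <= 0:
        raise ValueError("need more observations than parameters")
    chi2 = sum((yi - mi) ** 2 / variance_fn(mi) for yi, mi in zip(y, mu))
    return chi2 / df


def gaussian_variance(mu):
    # Gaussian family: V(mu) = 1
    return 1.0


def gamma_variance(mu):
    # Gamma family: V(mu) = mu^2
    return mu ** 2


# Hypothetical toy data: four observations, two parameters
y = [1.0, 2.0, 4.0, 7.0]
mu = [1.5, 2.5, 3.5, 6.5]

phi_gaussian = pearson_scale(y, mu, gaussian_variance, n_params=2)
phi_gamma = pearson_scale(y, mu, gamma_variance, n_params=2)
```

For the binomial and Poisson families, the text above says the scale parameter is simply set to one, so no such calculation is needed for the reported value.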
The scale parameter is a byproduct of the estimation of the GEE model and is
estimated for all possible family-link combinations. For some of these
models (such as binomial and Poisson), there is no such scale parameter in
theory. For these cases, the scale parameter should be one. To ensure that
it is, we multiply the resulting variance matrix by the estimated scale
parameter and then set the estimate to one.
If you are trying to obtain results that match other packages, keep in mind
that they may not offer this feature. If not, you may obtain the unscaled
results from Stata by specifying the scale(phi) option.
Robust estimates of variance
This discussion applies only to the robust estimation results.
Stata scales the robust estimates of variance in the following way (which
may differ from other implementations). In addition to the scale-parameter
adjustment, Stata multiplies the robust variance matrix by a factor that
depends on the family. Here g is the number of groups (clusters), n is the
number of observations, and k is the number of parameters in the model.
For the Gaussian family (linear regression) models, the robust variance
matrix is multiplied by
$$\frac{g}{g-1} \times \frac{n-1}{n-k}$$
For all other families, the robust variance matrix is multiplied by
$$\frac{g}{g-1}$$
This rescaling of the robust variance matrix is standard throughout Stata
for clustered robust estimation results. Many implementations apply only
the g/(g − 1) factor, which Stata uses for the non-Gaussian families. The
extra term in the scaling for the Gaussian family is intended to give
coverage probabilities closer to the nominal level.
If you are trying to match results obtained from another package, you may
need to get the variance–covariance matrix and undo the scaling that
Stata applied to see the same results.
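As a rough illustration of undoing the scaling, the Python sketch below computes the multiplier described above and divides it back out of a variance–covariance matrix. The function names, the example counts (50 clusters, 500 observations, 4 parameters), and the matrix values are all hypothetical; the matrix would in practice be the one reported by Stata after a robust fit.

```python
def cluster_factor(g, n, k, gaussian):
    """Multiplier applied to the robust variance matrix.

    g : number of groups (clusters)
    n : number of observations
    k : number of parameters in the model
    gaussian : True for the Gaussian family, False for all other families
    """
    if g < 2:
        raise ValueError("need at least two clusters")
    factor = g / (g - 1)
    if gaussian:
        if n <= k:
            raise ValueError("need more observations than parameters")
        factor *= (n - 1) / (n - k)  # extra term for the Gaussian family
    return factor


def undo_scaling(vcov, g, n, k, gaussian):
    """Divide every element of a scaled matrix by the multiplier."""
    f = cluster_factor(g, n, k, gaussian)
    return [[v / f for v in row] for row in vcov]


# Hypothetical counts and a made-up 2 x 2 scaled variance matrix
g, n, k = 50, 500, 4
scaled = [[0.040, 0.002],
          [0.002, 0.010]]

f_gaussian = cluster_factor(g, n, k, gaussian=True)
f_other = cluster_factor(g, n, k, gaussian=False)
unscaled = undo_scaling(scaled, g, n, k, gaussian=True)
```

Because the multiplier is a single scalar, standard errors from the unscaled matrix are the scaled standard errors divided by the square root of that multiplier.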