Stata | FAQ: Comparing RE and PA models

What is the difference between random-effects and population-averaged estimators?

Title		Comparing RE and PA models
Author		William Sribney, StataCorp

Random-effects estimators (or other cluster-specific estimators) fit the model

        Pr(Y_ij=1 | X_ij, u_i) = F(X_ij b + u_i)

whereas population-averaged estimators fit the model

        Pr(Y_ij=1 | X_ij) = G(X_ij b*)

The subtle point is that b and b* are different population parameters. Hence, the estimators are estimating different things. In practice, however, b and b* are often very close.

The population-averaged model does NOT fully specify the distribution of the population. The cluster-specific model DOES fully specify the distribution (u_i is either given a distribution—i.e., a random-effects model—or is considered fixed like X_ij—i.e., a fixed-effects model). The population-averaged model specifies only a marginal distribution. Hence, the term “marginal” is often used for GEE estimates.

The subtle difference between b and b* is best explained with an example.

An example with logit

Suppose that you are looking at

        Outcome   Y_ij: employment/unemployment
    
        Predictor X_ij: married/unmarried

Then, under the cluster-specific model

        logit Pr(Y_ij=1 | X_ij, u_i) = a + X_ij b + u_i

the odds ratio

                 Pr(Y_ij=1 | X_ij=1, u_i)/Pr(Y_ij=0 | X_ij=1, u_i)
        OR_cs =  -------------------------------------------- = exp(b)
                 Pr(Y_ij=1 | X_ij=0, u_i)/Pr(Y_ij=0 | X_ij=0, u_i)

represents the odds of the person being employed if married compared with the odds of the SAME person being employed if not married.

Under the population-averaged model

        logit Pr(Y_ij=1 | X_ij) = a + X_ij b*

the odds ratio

                 Pr(Y_ij=1 | X_ij=1)/Pr(Y_ij=0 | X_ij=1)
        OR_pa =  ------------------------------------ = exp(b*)
                 Pr(Y_ij=1 | X_ij=0)/Pr(Y_ij=0 | X_ij=0)

represents the odds of an AVERAGE married person being employed compared with the odds of an AVERAGE unmarried person being employed.

Rather than saying “AVERAGE”, I sometimes speak loosely of the odds of a married person “picked at random” being employed compared with the odds of an unmarried person “picked at random” being employed.

Let me now show that b and b* are, in general, different population parameters.

Here is my definition of the population DISTRIBUTION. (It is NOT a dataset.) The total population consists of five subjects:

        subject i   j    X_ij   u_i    Z_ij    Pr_cs    Pr_pa
        ---------  ---   ----  ----   -----   ------   ------
            1       1     0    -0.2   -0.10   0.4750   0.5249
            1       2     1    -0.2    0.50   0.6225   0.6674
            2       1     0    -0.1    0.00   0.5000   0.5249
            2       2     1    -0.1    0.60   0.6457   0.6674
            3       1     0     0.0    0.10   0.5250   0.5249
            3       2     1     0.0    0.70   0.6682   0.6674
            4       1     0     0.1    0.20   0.5498   0.5249
            4       2     1     0.1    0.80   0.6900   0.6674
            5       1     0     0.2    0.30   0.5744   0.5249
            5       2     1     0.2    0.90   0.7109   0.6674

Here Z_ij = a + X_ij b + u_i, with a = 0.1, b = 0.6, and u_i as given. (The coefficient here is the cluster-specific b, not the population-averaged b*.)

The cluster-specific probability Pr_cs is given by

        Pr_cs = exp(Z_ij)/(1 + exp(Z_ij))

For this population, the population-averaged probability, Pr_pa, is simply the average of Pr_cs for each X_ij. That is,

        Pr_pa(X_ij=1) = (1/5) * Σ_{i=1}^{5} Pr_cs(X_ij=1)
                      = (1/5) * (0.6225 + 0.6457 + 0.6682 + 0.6900 + 0.7109)
                      = 0.6674
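The table and the averaging step can be checked numerically. The following is a minimal sketch in Python (not Stata), using only the values given above (a = 0.1, b = 0.6, and the five u_i); the function names `inv_logit`, `pr_cs`, and `pr_pa` are my own labels for illustration.

```python
import math

# Population parameters taken from the text.
a, b = 0.1, 0.6
# Subject-specific effects u_i for the five subjects in the table.
u = [-0.2, -0.1, 0.0, 0.1, 0.2]

def inv_logit(z):
    """Logistic function exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))

def pr_cs(x, u_i):
    """Cluster-specific probability for predictor value x and effect u_i."""
    return inv_logit(a + b * x + u_i)

def pr_pa(x):
    """Population-averaged probability: the mean of Pr_cs over the five subjects."""
    return sum(pr_cs(x, u_i) for u_i in u) / len(u)

# Print one row per (subject, X) combination, mirroring the table layout.
for i, u_i in enumerate(u, start=1):
    for x in (0, 1):
        z = a + b * x + u_i
        print(f"{i}  X={x}  u={u_i:+.1f}  Z={z:+.2f}  "
              f"Pr_cs={pr_cs(x, u_i):.4f}  Pr_pa={pr_pa(x):.4f}")
```

The printed Pr_cs and Pr_pa columns should agree with the table to four decimal places.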

The cluster-specific odds ratio is exp(b) = exp(0.6) = 1.8221.

This is, of course, the same as the odds ratios computed within subject:

        Subject 1:  (0.6225/(1 - 0.6225))/(0.4750/(1 - 0.4750)) = 1.8221
        
        Subject 2:  (0.6457/(1 - 0.6457))/(0.5000/(1 - 0.5000)) = 1.8221
        
        Subject 3:  (0.6682/(1 - 0.6682))/(0.5250/(1 - 0.5250)) = 1.8221
        
        Subject 4:  (0.6900/(1 - 0.6900))/(0.5498/(1 - 0.5498)) = 1.8221
        
        Subject 5:  (0.7109/(1 - 0.7109))/(0.5744/(1 - 0.5744)) = 1.8221
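The same check can be done without rounding the probabilities first. This is a small Python sketch under the values stated above; `within_subject_or` is just an illustrative name.

```python
import math

a, b = 0.1, 0.6
u = [-0.2, -0.1, 0.0, 0.1, 0.2]

def inv_logit(z):
    """Logistic function exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Odds corresponding to probability p."""
    return p / (1.0 - p)

within_subject_or = []
for i, u_i in enumerate(u, start=1):
    p0 = inv_logit(a + u_i)      # Pr_cs for X_ij = 0
    p1 = inv_logit(a + b + u_i)  # Pr_cs for X_ij = 1
    ratio = odds(p1) / odds(p0)  # u_i cancels, leaving exp(b)
    within_subject_or.append(ratio)
    print(f"Subject {i}: {ratio:.4f}")
```

Because u_i enters both the numerator and the denominator odds, it cancels, which is why every subject yields the same ratio regardless of the value of u_i.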

The population-averaged odds ratio is

        exp(b*) = (0.6674/(1 - 0.6674))/(0.5249/(1 - 0.5249)) = 1.8169

Solving for b* gives

        b* = 0.5972

so b* is closer to the null, as the theory predicts (see the Neuhaus papers).
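The calculation of b* can be sketched in Python as follows. It uses the unrounded averaged probabilities rather than the four-digit table entries, so the last digit could in principle differ slightly from a hand calculation; the variable names are illustrative.

```python
import math

a, b = 0.1, 0.6
u = [-0.2, -0.1, 0.0, 0.1, 0.2]

def inv_logit(z):
    """Logistic function exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Odds corresponding to probability p."""
    return p / (1.0 - p)

# Population-averaged probabilities: average Pr_cs over subjects at each X.
pr_pa0 = sum(inv_logit(a + u_i) for u_i in u) / len(u)
pr_pa1 = sum(inv_logit(a + b + u_i) for u_i in u) / len(u)

or_pa = odds(pr_pa1) / odds(pr_pa0)  # population-averaged odds ratio, exp(b*)
b_star = math.log(or_pa)             # population-averaged coefficient b*

print(f"OR_pa = {or_pa:.4f}")
print(f"b*    = {b_star:.4f}")
print(f"b     = {b:.4f}")
```

The averaging happens on the probability scale, and the logit is applied afterward. Because the logit is nonlinear, averaging and then transforming does not give the same answer as transforming each subject separately, and that is the source of the gap between b and b*.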

b and b* above are the TRUE population parameters, not estimates.

If we had a dataset consisting of a sample from this population distribution, and we used xtgee on this dataset (with the logit link and binomial distribution), xtgee would be estimating b*. If we used regular logit, we would also be estimating b* (one would want to specify the vce(cluster clustvar) option to correct the standard errors in this case).

If we used clogit on this dataset, or a random-effects logit estimator (one that assumes normally distributed u_i), we would be estimating b.
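To see why a conditional estimator targets b, here is a rough Python sketch. It relies on two assumptions that go beyond the text: that a subject's two outcomes are independent given u_i, and the standard result that, for matched pairs with a single 0/1 covariate, the conditional logit estimate is the log of the ratio of the two kinds of discordant pairs. The names `p01` and `p10` are my own.

```python
import math

a, b = 0.1, 0.6
u = [-0.2, -0.1, 0.0, 0.1, 0.2]

def inv_logit(z):
    """Logistic function exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))

# Population probabilities of the two discordant outcome patterns.
# Assumption: given u_i, the outcomes at X=0 and X=1 are independent.
p01 = 0.0  # Y=0 when X=0 and Y=1 when X=1
p10 = 0.0  # Y=1 when X=0 and Y=0 when X=1
for u_i in u:
    p0 = inv_logit(a + u_i)
    p1 = inv_logit(a + b + u_i)
    p01 += (1.0 - p0) * p1 / len(u)
    p10 += p0 * (1.0 - p1) / len(u)

# Limit of the matched-pairs conditional logit estimate under these assumptions.
b_conditional = math.log(p01 / p10)
print(f"log(p01 / p10) = {b_conditional:.4f}")
```

Within each subject, the ratio of the two discordant probabilities equals exp(b), so the ratio of their population totals is also exp(b). Conditioning on the subject removes u_i rather than averaging over it.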

(Aside: The random-effects logit estimator described in the Neuhaus papers assumes a distribution for u_i different from that of the random-effects logit estimator implemented in Stata. My theory discussion here assumes one is using the “correct” distribution of u_i. I do not want to digress on this subject, but random-effects estimators that assume different distributions for u_i are technically different estimators; hence, there is more than one “random-effects logit estimator”.)

Here b and b* are almost the same number (b = 0.6 and b* = 0.5972), so it is easy to obscure the fact that the cluster-specific and population-averaged estimators are estimating different parameters. In other cases, the difference can be greater, so it is important to keep in mind which one you are estimating.

The bottom line for someone thinking about using the GEE estimator is to think about whether the averaging procedure makes sense for the type of inference you want to make. If you want to estimate how marriage makes a person get his act together and get a job (or else leave it to the spouse to bring home the groceries), then you want to go after b. If you want to look at employment for the average married person compared with the average unmarried person, then you want to go after b*.

Sometimes you might argue that b* and b should be close, so the distinction is not worth making. But you had better be sure of your argument. Zero within-cluster correlation (u_i = 0 for every subject) makes them the same; a large Var(u_i) makes the difference greater.
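The effect of Var(u_i) can be illustrated with the same five-subject population by multiplying every u_i by a common factor. This is a hypothetical variation of the example, not part of the original population definition; the factors 0, 1, 5, and 10 are arbitrary choices, and `b_star` is an illustrative name.

```python
import math

a, b = 0.1, 0.6
base_u = [-0.2, -0.1, 0.0, 0.1, 0.2]  # the u_i from the example

def inv_logit(z):
    """Logistic function exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Log odds of probability p."""
    return math.log(p / (1.0 - p))

def b_star(scale):
    """Population-averaged coefficient when every u_i is multiplied by scale."""
    u = [scale * u_i for u_i in base_u]
    pa0 = sum(inv_logit(a + u_i) for u_i in u) / len(u)
    pa1 = sum(inv_logit(a + b + u_i) for u_i in u) / len(u)
    return logit(pa1) - logit(pa0)

# scale = 0 means u_i = 0 for everyone; scale = 1 is the original example.
for scale in (0, 1, 5, 10):
    print(f"scale = {scale:2d}   b = {b:.4f}   b* = {b_star(scale):.4f}")
```

With scale 0, the two parameters coincide; as the spread of u_i grows, b* moves further toward the null while b stays at 0.6.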

References

Neuhaus, J. M. 1992. Statistical methods for longitudinal and clustered designs with binary responses. Statistical Methods in Medical Research 1: 249–273.

Neuhaus, J. M., J. D. Kalbfleisch, and W. W. Hauck. 1991. A comparison of cluster-specific and population-averaged approaches for analyzing correlated binary data. International Statistical Review 59: 25–35.
