
Title | Interpreting the within and between variances in xtsum
Author | James Hardin, StataCorp
The within and between variances may not sum in the way that you expect for two reasons:
For unbalanced data, the between variance is calculated using the mean of the panel means. This may be different from the overall mean. The overall mean can be calculated as a weighted mean of the panel means where the weights are given by the number of observations in the panel. The mean of the panel means is unweighted (or all weights equal to one if you like).
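As a rough illustration of this point, the following Python sketch compares the two means on a tiny unbalanced dataset. The panel labels and values are made up for the example and are not taken from any Stata dataset.

```python
from statistics import mean

# Hypothetical unbalanced panel data: panel "A" has 3 observations, panel "B" has 1.
panels = {"A": [1.0, 2.0, 3.0], "B": [10.0]}

# Mean of each panel.
panel_means = {name: mean(values) for name, values in panels.items()}

# Overall mean: every observation counts once.
overall_mean = mean(x for values in panels.values() for x in values)

# Unweighted mean of the panel means: every panel counts once.
unweighted_mean = mean(panel_means.values())

# Weighted mean of the panel means, with weights equal to the panel sizes.
total_obs = sum(len(values) for values in panels.values())
weighted_mean = sum(len(values) * panel_means[name]
                    for name, values in panels.items()) / total_obs

print(overall_mean, weighted_mean, unweighted_mean)
```

Here the weighted mean of the panel means equals the overall mean, while the unweighted mean of the panel means is pulled toward the small panel.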
For balanced data, the only difference is the n/(n−1) factor: the overall calculation uses n = the total number of observations, whereas the between calculation uses n = the number of panels. For illustration, consider the following example for weakly balanced data.
Variable         |      Mean   Std. Dev.       Min        Max |    Observations
-----------------+--------------------------------------------+----------------
birth_yr overall |   48.4963   3.091477         42         53 |     N =    2700
         between |             3.096644         42         53 |     n =     270
         within  |                    0    48.4963    48.4963 |     T =      10
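The figures in this table can be checked with a little arithmetic. The Python sketch below assumes, as the within standard deviation of 0 indicates, that birth_yr is constant within each panel, so the overall sum of squared deviations is T times the between sum of squared deviations. It also assumes the usual sample-variance divisors, N−1 for overall and n−1 for between. The variable names are illustrative only.

```python
import math

# Values read from the xtsum output above.
N_obs = 2700           # total number of observations
n_panels = 270         # number of panels
T = 10                 # observations per panel
sd_overall = 3.091477  # reported overall standard deviation

# With no within-panel variation:
#   overall variance = T * SS_between / (N_obs - 1)
#   between variance = SS_between / (n_panels - 1)
# so the ratio between/overall is (N_obs - 1) / (T * (n_panels - 1)).
variance_ratio = (N_obs - 1) / (T * (n_panels - 1))

# Between standard deviation implied by the overall one.
sd_between = sd_overall * math.sqrt(variance_ratio)

print(variance_ratio, sd_between)
```

Under these assumptions the implied between standard deviation agrees with the reported 3.096644 up to rounding of the inputs, so the gap between the overall and between values here comes entirely from the different divisors.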