Why don’t the decomposed variances in xtsum add up?
Title: Interpreting the within and between variances in xtsum
Author: James Hardin, StataCorp
Date: November 1996; minor revisions July 2011

The within and between variances may not sum in the way that you expect
for two reasons:
- The reported variances are bias-corrected estimates: each variance is
multiplied by n/(n−1), so each printed standard deviation is multiplied by
the square root of that factor.
- The data are unbalanced, so the overall mean differs from the mean of
the panel means.
For unbalanced data, the between variance is calculated using the mean of
the panel means, which may differ from the overall mean. The overall
mean is a weighted mean of the panel means, with weights equal to the
number of observations in each panel. The mean of the panel means is
unweighted (or, if you like, all weights are equal to one).
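The two means can be written out explicitly. The notation below is an assumption introduced for illustration and does not come from the xtsum documentation: x_{it} is the observation for panel i at time t, T_i is the number of observations in panel i, n is the number of panels, and N is the total number of observations.

```latex
\[
\bar{x}_i = \frac{1}{T_i}\sum_{t=1}^{T_i} x_{it}, \qquad
N = \sum_{i=1}^{n} T_i
\]
\[
\text{overall mean: } \bar{x} = \frac{1}{N}\sum_{i=1}^{n} T_i\,\bar{x}_i, \qquad
\text{mean of panel means: } \tilde{x} = \frac{1}{n}\sum_{i=1}^{n} \bar{x}_i
\]
\[
\text{between variance: } s_B^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(\bar{x}_i - \tilde{x}\right)^2
\]
```

When every panel has the same number of observations (T_i = T, so N = nT), the weights T_i/N all equal 1/n and the two means coincide.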
For balanced data, the only difference is the n/(n−1) factor: the overall
variance uses n = total number of observations, and the between variance
uses n = number of panels. For illustration, consider the following example
with weakly balanced data.
. use http://www.stata-press.com/data/r12/nlswork
(National Longitudinal Survey. Young Women 14-26 years of age in 1968)

. by idcode: keep if _N==10
(25834 observations deleted)

. xtsum birth_yr

Variable         |      Mean   Std. Dev.       Min        Max |    Observations
-----------------+--------------------------------------------+----------------
birth_yr overall |   48.4963    3.091477        42         53 |     N =    2700
         between |              3.096644        42         53 |     n =     270
         within  |                     0   48.4963    48.4963 |     T =      10

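. display 3.091477*sqrt(2699/2700)
3.0909045

. display 3.096644*sqrt(269/270)
3.0909042

Because birth_yr does not vary within a panel, the within standard deviation
is zero. Once the n/(n−1) factors are removed, the overall and between
standard deviations agree up to rounding in the printed digits.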
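The same check can be sketched outside Stata. The Python example below is not part of the original session and does not use the nlswork data. It builds two small hypothetical panels of a time-invariant variable (so the within component is zero), computes the n−1 standard deviations described above, and removes the n/(n−1) factor from each. The names `summarize` and `uncorrected` and the panel values are made up for illustration.

```python
import math
import statistics


def summarize(panels):
    """Overall and between summaries for a dict mapping panel id -> observations."""
    all_obs = [x for obs in panels.values() for x in obs]
    panel_means = [statistics.fmean(obs) for obs in panels.values()]
    return {
        "N": len(all_obs),                                     # total observations
        "n": len(panel_means),                                 # number of panels
        "overall_mean": statistics.fmean(all_obs),             # weighted by panel size
        "mean_of_panel_means": statistics.fmean(panel_means),  # unweighted
        "overall_sd": statistics.stdev(all_obs),               # n-1 denominator, n = N
        "between_sd": statistics.stdev(panel_means),           # n-1 denominator, n = panels
    }


def uncorrected(sd, n):
    """Undo the bias correction: multiply the sd by sqrt((n-1)/n)."""
    return sd * math.sqrt((n - 1) / n)


# Hypothetical data: the variable is constant within each panel.
balanced = {"a": [42.0] * 4, "b": [48.0] * 4, "c": [53.0] * 4}
unbalanced = {"a": [42.0] * 2, "b": [48.0] * 5, "c": [53.0] * 3}

for label, panels in (("balanced", balanced), ("unbalanced", unbalanced)):
    s = summarize(panels)
    print(label)
    print("  overall mean        :", s["overall_mean"])
    print("  mean of panel means :", s["mean_of_panel_means"])
    print("  overall sd, uncorrected:", uncorrected(s["overall_sd"], s["N"]))
    print("  between sd, uncorrected:", uncorrected(s["between_sd"], s["n"]))
```

In the balanced panel the two means and the two uncorrected standard deviations coincide. In the unbalanced panel they differ, because the overall mean weights each panel by its size while the mean of the panel means does not.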