Re: st: Within and between variances

From: "David M. Drukker, Stata Corp"
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Within and between variances
Date: Thu, 15 May 2003 11:16:14 -0500

```Ineta Sokolowski <IS@ALM.AU.DK> wrote:

> I wish, there were more details in manual.

Me too.  I will make some changes to this manual entry.

Ineta continued with:

>
> I need the variance components to calculate the ICC (intra class
> correlation). I tried different methods to calculate ICC, including
> -loneway- and -xtsum-, but it gives very different result.

-xtsum- and -loneway- produce different summaries of your data.
-loneway- produces estimates of the variance components and the intraclass
correlation.  -xtsum- simply summarizes the within-transformed and
between-transformed variables.

If you are interested in estimating the ICC, use the results reported by
-loneway-, not -xtsum-.

Ineta also asked for more details.  My previous posting provides the details
of what -xtsum- is doing, and [R] loneway provides the details of what
-loneway- is doing.  Rather than repeat what is available elsewhere, I will
attempt to illustrate what -loneway- is doing by comparing it with another
estimator of the ICC.

I will use an unbalanced longitudinal dataset on complaints by person over
time.  The dependent variable, complain, is binary.

. clear

. set mem 10m
(10240k)

. use http://www.stata-press.com/data/r8/chicken

Now let's use -loneway- to estimate the ICC.

. loneway complain person

                  One-way Analysis of Variance for complain:

                                              Number of obs =      5952
                                                  R-squared =    0.2885

    Source                SS         df      MS            F     Prob > F
-------------------------------------------------------------------------
Between person         239.46604   1075     .2227591      1.84     0.0000
Within person          590.52976   4876    .12110947
-------------------------------------------------------------------------
Total                   829.9958   5951    .13947165

         Intraclass       Asy.
         correlation      S.E.       [95% Conf. Interval]
         ------------------------------------------------
            0.13175     0.01204       0.10815     0.15535

         Estimated SD of person effect           .1355647
         Estimated SD within person              .3480079
         Est. reliability of a person mean        0.45632
              (evaluated at n=5.53)

-loneway- estimates the ICC without any controls.  Thus, another way of
estimating the ICC would be to estimate the parameters of a random-effects
model without any covariates.

Recall that a random-effects model on a constant only is

y_it = cons + u_i + e_it

where
    y_it are the observations on the dependent variable;

    cons is a fixed constant to be estimated;

    u_i are the unobserved individual-level effects, which are assumed to be
        independently and identically distributed (i.i.d.) over the
        individuals in the sample and to satisfy E[u_i e_it] = 0; and

    e_it are the unobserved idiosyncratic errors, which are assumed to be
        i.i.d. over the entire sample.

With unbalanced data, -xtreg- with the -sa- option will produce an estimate
of the ICC that is very similar to the one produced by -loneway- but not
exactly the same.  With balanced data, the estimates will be the same.
Until an update planned for the near future, -xtreg, sa- will not run with a
constant-only model.  However, we can fit the model by including a manually
generated constant, as in the output below.

. gen one = 1

. xtreg complain one , i(person) sa

Random-effects GLS regression                   Number of obs      =      5952
Group variable (i): person                      Number of groups   =      1076

R-sq:  within  = 0.0000                         Obs per group: min =         3
       between = 0.0000                                        avg =       5.5
       overall = 0.0000                                        max =         8

Random effects u_i ~ Gaussian                   Wald chi2(0)       =    738.70
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =         .

------------------------------------------------------------------------------
    complain |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         one |   .1685169   .0062002    27.18   0.000     .1563646    .1806691
       _cons |  (dropped)
-------------+----------------------------------------------------------------
     sigma_u |  .13560089
     sigma_e |  .34800786
         rho |  .13181353   (fraction of variance due to u_i)
------------------------------------------------------------------------------

In -xt- terms, the SD of the person effect is the standard deviation of the
individual-level effect (sigma_u), and the SD within person is the standard
deviation of the idiosyncratic error (sigma_e).  -loneway- and -xtreg, sa-
each have a consistent estimator of the standard deviation of the
individual-level effect.  In fact, these estimators produce exactly the same
estimates from balanced data, but, since they use distinct adjustments for
unbalanced data, they produce slightly different estimates from unbalanced
datasets (.1355647 versus .13560089 here).  -loneway- and -xtreg, sa- use
the same estimator of the standard deviation of the idiosyncratic error,
which is why both report .3480079.

Neither of these estimators explicitly takes account of the binary nature of
the dependent variable.

I hope that this helps.

David
--ddrukker@stata.com
```
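As a rough check on the numbers reported above, the sketch below recomputes the ICC by hand from the two outputs. It is not part of the original posting, and it is written in Python only so the arithmetic can be run anywhere. It assumes the textbook one-way ANOVA estimator, in which the within mean square estimates the idiosyncratic variance and (MS between - MS within) / n0 estimates the person-effect variance, and it assumes the Spearman-Brown form for the reliability of a person mean. The value n0 = 5.53 is the rounded figure printed by -loneway-, so the results should agree with the output only to about three or four decimals. All function and variable names are made up for illustration.

```python
import math

# Values copied from the -loneway- output above.
ms_between = 0.2227591
ms_within = 0.12110947
n0 = 5.53  # rounded group-size adjustment printed as "evaluated at n=5.53"


def anova_sds(ms_between, ms_within, n0):
    """Return (sd_u, sd_e) under the assumed one-way ANOVA estimator."""
    var_e = ms_within                      # idiosyncratic (within) variance
    var_u = (ms_between - ms_within) / n0  # person-effect (between) variance
    return math.sqrt(var_u), math.sqrt(var_e)


def icc(sd_u, sd_e):
    """ICC = share of total variance due to the person effect."""
    return sd_u**2 / (sd_u**2 + sd_e**2)


def reliability_of_mean(rho, n):
    """Assumed Spearman-Brown form for the reliability of a mean of n obs."""
    return n * rho / (1 + (n - 1) * rho)


# ICC rebuilt from the ANOVA table (limited by the rounding of n0).
sd_u_anova, sd_e_anova = anova_sds(ms_between, ms_within, n0)
icc_anova = icc(sd_u_anova, sd_e_anova)

# ICC rebuilt from the standard deviations each command reports.
icc_loneway = icc(0.1355647, 0.3480079)    # -loneway- SDs
icc_xtreg = icc(0.13560089, 0.34800786)    # -xtreg, sa- sigma_u and sigma_e

reliability = reliability_of_mean(icc_loneway, n0)

print(f"ICC from ANOVA table:      {icc_anova:.5f}")
print(f"ICC from -loneway- SDs:    {icc_loneway:.5f}")
print(f"rho from -xtreg, sa- SDs:  {icc_xtreg:.5f}")
print(f"Reliability of a mean:     {reliability:.5f}")
```

The two commands differ only in the estimate of the person-effect standard deviation, which is why the two ICC values agree to three decimals but not beyond.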