
Re: st: Within and between variances


From   "David M. Drukker, Stata Corp" <[email protected]>
To   [email protected]
Subject   Re: st: Within and between variances
Date   Thu, 15 May 2003 11:16:14 -0500

Ineta Sokolowski <[email protected]> wrote:

> I wish, there were more details in manual.

Me too.  I will make some changes to this manual entry.

Ineta continued with

> 
> I need the variance components to calculate the ICC (intra class
> correlation). I tried different methods to calculate ICC, including
> -loneway- and -xtsum-, but it gives very different result. 

-xtsum- and -loneway- produce different summaries of your data.  -loneway-
estimates the variance components and the intraclass correlation.  -xtsum-
simply summarizes the within- and between-transformed variables, so the
standard deviations it reports are not estimates of the variance components.

If you are interested in estimating the ICC, use the results reported by
-loneway-, not -xtsum-.  

Ineta also asked for more details.  My previous posting describes what
-xtsum- is doing, and [R] loneway describes what -loneway- is doing.  Rather
than repeat what is available elsewhere, I will illustrate what -loneway- is
doing by comparing it with another estimator of the ICC.

I will use an unbalanced longitudinal dataset on complaints by person over
time.  The dependent variable, complain, is binary.

. clear

. set mem 10m
(10240k)

. use http://www.stata-press.com/data/r8/chicken

Now let's use -loneway- to estimate the ICC.

. loneway complain person

                  One-way Analysis of Variance for complain: 

                                              Number of obs =      5952
                                                  R-squared =    0.2885

    Source                SS         df      MS            F     Prob > F
-------------------------------------------------------------------------
Between person         239.46604   1075     .2227591      1.84     0.0000
Within person          590.52976   4876    .12110947
-------------------------------------------------------------------------
Total                   829.9958   5951    .13947165

         Intraclass       Asy.        
         correlation      S.E.       [95% Conf. Interval]
         ------------------------------------------------
            0.13175     0.01204       0.10815     0.15535

         Estimated SD of person effect           .1355647
         Estimated SD within person              .3480079
         Est. reliability of a person mean        0.45632
              (evaluated at n=5.53)
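
As a rough check on how the pieces of this table fit together, the sketch
below recomputes the two SDs and the ICC from the mean squares using the
usual one-way ANOVA formulas.  I am assuming those formulas are what
-loneway- applies (see [R] loneway for the exact definitions) and that the
printed n=5.53 is the effective group size.  Because that n is rounded, the
results should match the table only to about three decimal places.  The
scalar names are made up for this illustration.

```stata
* Numbers copied from the -loneway- table above
* msb = between-person mean square, msw = within-person mean square
scalar msb = .2227591
scalar msw = .12110947
* n0 = effective group size as printed (rounded, so results are approximate)
scalar n0  = 5.53

* SD within person: square root of the within mean square
scalar sd_w = sqrt(msw)
* SD of person effect: ANOVA estimator (MSB - MSW)/n0, then square root
scalar sd_b = sqrt((msb - msw)/n0)
* Intraclass correlation: share of total variance due to the person effect
scalar icc  = sd_b^2/(sd_b^2 + sd_w^2)

display "SD within person    = " sd_w
display "SD of person effect = " sd_b
display "ICC                 = " icc
```

After -loneway- has run, the exact values should also be available as
r(sd_w), r(sd_b), and r(rho); type -return list- to see them.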

-loneway- estimates the ICC without any controls.  Thus, another way of
estimating the ICC is to estimate the parameters of a random-effects model
without any covariates.

Recall that a random-effects model on a constant only is

     y_it = cons + u_i + e_it

where 
      y_it are the observations on the dependent variable;

      cons is a fixed constant to be estimated;

      u_i are the unobserved individual level effects,  

          the u_i are assumed to be identically and independently distributed
          (i.i.d.) over the individuals in the sample,

          E[u_i e_it] = 0; and,

      e_it are the unobserved idiosyncratic errors which are assumed to be
      i.i.d. over the entire sample.
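
Under these assumptions, the ICC is the correlation between two different
observations on the same individual.  The sketch below writes sigma_u^2 and
sigma_e^2 for the variances of u_i and e_it (notation borrowed from the
-xtreg- output, not defined in the model statement above) and assumes both
have mean zero.

```latex
\begin{align*}
\operatorname{Var}(y_{it}) &= \sigma_u^2 + \sigma_e^2, \\
\operatorname{Cov}(y_{it}, y_{is}) &= \sigma_u^2 \quad (t \neq s), \\
\rho = \operatorname{Corr}(y_{it}, y_{is})
  &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}.
\end{align*}
```

The cross terms vanish because E[u_i e_it] = 0 and the e_it are independent
across observations, so only the shared u_i contributes to the covariance.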
	  

With unbalanced data, -xtreg- with the -sa- option produces an estimate of
the ICC that is very similar, but not identical, to the one produced by
-loneway-.  With balanced data, the two estimates are the same.  Currently,
-xtreg, sa- does not run with a constant-only model; an update in the near
future should change this.  In the meantime, we can fit the model by
including a manually generated constant, as in the output below.

. gen one = 1 

. xtreg complain one , i(person) sa 

Random-effects GLS regression                   Number of obs      =      5952
Group variable (i): person                      Number of groups   =      1076

R-sq:  within  = 0.0000                         Obs per group: min =         3
       between = 0.0000                                        avg =       5.5
       overall = 0.0000                                        max =         8

Random effects u_i ~ Gaussian                   Wald chi2(0)       =    738.70
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =         .

------------------------------------------------------------------------------
    complain |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         one |   .1685169   .0062002    27.18   0.000     .1563646    .1806691
       _cons |  (dropped)
-------------+----------------------------------------------------------------
     sigma_u |  .13560089
     sigma_e |  .34800786
         rho |  .13181353   (fraction of variance due to u_i)
------------------------------------------------------------------------------
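
The rho reported here is the share of the total variance due to u_i.  A
minimal sketch that recomputes it from the two sigmas printed above (the
scalar names are made up for this illustration):

```stata
* sigma_u and sigma_e copied from the -xtreg, sa- output above
scalar sigma_u = .13560089
scalar sigma_e = .34800786

* rho = sigma_u^2 / (sigma_u^2 + sigma_e^2)
scalar rho_sa = sigma_u^2/(sigma_u^2 + sigma_e^2)

display "rho from -xtreg, sa-          = " rho_sa
display "difference from -loneway- ICC = " rho_sa - .13175
```

This should reproduce the reported .13181353 up to rounding.  After -xtreg-
has run, the same quantities are stored in e(sigma_u), e(sigma_e), and
e(rho).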

In -xt- terms, the SD of the person effect is the standard deviation of the
individual-level effect (sigma_u), and the SD within person is the standard
deviation of the idiosyncratic error (sigma_e).  -loneway- and -xtreg, sa-
each use a consistent estimator of the standard deviation of the
individual-level effect.  These estimators produce exactly the same
estimates from balanced data, but because they use distinct adjustments for
unbalanced data, they produce slightly different estimates from unbalanced
datasets (here .1355647 versus .13560089).  -loneway- and -xtreg, sa- use
the same estimator of the standard deviation of the idiosyncratic error,
which is why the two outputs agree on that value (.3480079 and .34800786).
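
The balanced-data claim can be checked on simulated data.  The sketch below
is only an illustration: the data are made up (200 persons, 5 observations
each), the variable and scalar names are hypothetical, and it assumes a
Stata version that has rnormal() and in which -xtreg- accepts the manually
generated constant and the i() option as in the output above.

```stata
* Simulated balanced panel: 200 persons, 5 observations each (made-up data)
clear
set seed 12345
set obs 200
generate person = _n
* Person-level effect, drawn once per person
generate u = rnormal(0, 1)
* Five copies of each person; u is duplicated, so it is constant within person
expand 5
* Outcome = constant + person effect + idiosyncratic error
generate y = 2 + u + rnormal(0, 2)

* ANOVA-based ICC from -loneway-
loneway y person
scalar icc_lw = r(rho)

* Swamy-Arora ICC from -xtreg, sa-, using a manually generated constant
generate one = 1
xtreg y one, i(person) sa
scalar icc_sa = e(rho)

display "loneway ICC   = " icc_lw
display "xtreg, sa rho = " icc_sa
```

If the two displayed values differ by more than rounding error, the most
likely cause is that one of the assumptions above does not hold in your
Stata version.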

Neither of these estimators explicitly takes account of the binary nature of
the dependent variable.

I hope that this helps.


	David
	[email protected]