Stata The Stata listserver

Re: st: Within and between variances


From   "David M. Drukker, Stata Corp" <[email protected]>
To   [email protected]
Subject   Re: st: Within and between variances
Date   Tue, 13 May 2003 14:57:12 -0500

Ineta Sokolowski <[email protected]> wrote: 

> Can anybody explain, why the one-way ANOVA (-loneway-) and -xtsum- gives
> different standard deviations for within and between effects of a
> variable? 
> 
> . loneway diagnosis gpnum
> . xtsum diagnosis, i(gpnum)
> 
> where "diagnosis" is a dichotomous variable (0=no illness, 1=illness)
> and "gpnum" is the general practitioners (GP) number (38 different
> numbers). Each GP has different number of patients (between 44 and 111).
> 
> How are the SD calculated in each procedure?
> 

-xtsum- and -loneway- provide different summaries of the data.  -xtsum-
summarizes the overall variable, the between-transformed variable, and the
within-transformed variable; the standard deviations it reports are the
estimated standard deviations of those transformed variables.  In contrast,
-loneway- provides a one-way analysis-of-variance decomposition of the
specified variable.  The formulas for the standard deviations that -loneway-
reports are standard in the ANOVA literature and are documented in
[R] loneway.  It is interesting to note that they correspond to the variance
components in a constant-only random-effects model.

Let's consider the case of -xtsum- in more detail.  Since the manual does not
go into great detail, I will.  Let 

     ytilde_it = y_it - ybar_i + ybarbar

be the within transformed variable,

       where
              y_it     are the observations on the specified variable
                       in group i at time t,

              ybar_i   is the mean of y_it over the observations in group i,
       and
              ybarbar  is the overall mean of y.


The reported within standard deviation is the estimated standard deviation
of ytilde.

For the between model, the reported standard deviation is the estimated
standard deviation of the n group means of y_it, ybar_i (one mean per group).
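
For anyone who wants to check this on their own data, here is a rough
do-file sketch of the two calculations just described.  It assumes the
poster's variables (diagnosis, gpnum) are in memory with no missing values;
the names ybarbar, ybar_i, ytilde, and onepergp are made up for the
illustration.  If the description above is right, the two -summarize-
standard deviations should agree with the within and between lines that
-xtsum- prints.

```stata
* Sketch only: rebuild the -xtsum- standard deviations by hand.
* Assumes diagnosis and gpnum have no missing values.
quietly summarize diagnosis
scalar ybarbar = r(mean)                  // overall mean of y

* Group mean of y, repeated on every observation of the group
bysort gpnum: egen double ybar_i = mean(diagnosis)

* Within-transformed variable: y_it - ybar_i + ybarbar
generate double ytilde = diagnosis - ybar_i + ybarbar
summarize ytilde                          // Std. Dev. = within SD

* Keep one observation per GP so each group mean counts once
egen byte onepergp = tag(gpnum)
summarize ybar_i if onepergp              // Std. Dev. = between SD

* Compare with the built-in summary
xtsum diagnosis, i(gpnum)
```

Note that the between calculation weights every GP equally, whatever the
number of patients, because it uses one mean per group.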

Since the formulas for computing the standard deviations reported by
-loneway- are given in [R] loneway, I will not repeat them here.  Still, for
those who think in -xt- terms, it is interesting to note that the between
standard deviation reported by -loneway- is an estimate of the standard
deviation of the individual-level effect in a random-effects model in which
the only regressor is a constant.  Likewise, the reported within standard
deviation is an estimate of the standard deviation of the idiosyncratic
error in that same model.
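
To see that correspondence, one could run both commands side by side, as in
the do-file sketch below.  It again assumes the poster's variables; -xtreg-
is given no regressors, so only the constant is fit.  The two sets of numbers
are produced by different estimators, so with unbalanced groups such as
these I would expect them to be close rather than identical.

```stata
* Sketch only: -loneway- versus a constant-only random-effects fit.

* Look at the two estimated SD lines (GP effect and within GP)
loneway diagnosis gpnum

* No regressors, so the model contains only a constant;
* sigma_u is the SD of the GP-level effect,
* sigma_e is the SD of the idiosyncratic error
xtreg diagnosis, i(gpnum) re
```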

I hope that this helps.

     David
     [email protected]


