From: wgould@stata.com (William Gould)
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Experimental design - ANOVA/GLM? - please help
Date: Fri, 27 Sep 2002 08:04:00 -0500

```
Ricardo Ovaldia <ovaldia@yahoo.com> wrote,

> I was approached by an investigator with the following problem. He had two
> groups of experimental rats, 10 diabetic and 10 non-diabetic. Each of these
> rats had one litter of 10 pups (on average). On each of the pups a series of
> biochemicals was measured. He wants me to compare the mean values of these
> biochemicals in the pups from diabetic moms with those in the pups from
> non-diabetic moms. He then suggested that I do a simple t-test comparing the
> means of the two pup groups. I pointed out that the observations are not
> independent, because several pups come from the same litter, and that the
> litter effect needs to be taken into account.
>
> How can I set this up in Stata?

and my colleague Ken Higbee <khigbee@stata.com> showed how to do the problem
using -anova-.

What follows is really a footnote.  I want to compare the results obtained
by Ken with those that would have been obtained using -regress, cluster-.

Ken generated a phony dataset and got an F statistic of 6.8.  With -regress,
cluster-, I got 6.5.

Ken thoughtfully included in his posting how he generated the phony data,
which allowed me to try a different approach.  I started with Ken's data:

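* two treatment groups
clear
set obs 2
gen group = _n
* 10 mothers per group, with ids 1-10 reused in each group
expand 10
sort group
qui gen mom = _n in 1/10
qui replace mom = mom[_n-10] in 11/20
* litter size z is 8 to 12 pups per mother
set seed 32981
gen z = 10 + round(uniform()*4-2,1)
expand z
drop z
bysort group mom : gen pup = _n
* outcome: uniform noise on [0,8) plus the group number
gen y = uniform()*8 + group
compress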

which produces 190 observations on the variables

group        treatment group, 1 or 2
mom          mother id, 1, 2, ..., 10 (the same ids are used in both groups)
pup          pup id within mother, 1, 2, ..., 12
y            outcome variable, continuous, [1.02, 9.77]

To obtain the ordinary t-test for the difference in means between
two groups, but using -regress-, one types

. regress y group

To obtain the test while relaxing the assumption that the observations
are independent within mother, one types

. regress y group, cluster(mom)

So here's the output:

==============================================================================
Regression with robust standard errors                 Number of obs =     190
                                                       F(  1,     9) =    6.44
                                                       Prob > F      =  0.0319
                                                       R-squared     =  0.0295
Number of clusters (mom) = 10                          Root MSE      =  2.3819

------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       group |   .8258395   .3255219     2.54   0.032     .0894579    1.562221
       _cons |   5.209672   .1988641    26.20   0.000     4.759811    5.659534
------------------------------------------------------------------------------
==============================================================================

The t statistic for group is 2.54, so the corresponding F is (2.54)^2 = 6.45,
or about 6.5 (the F of 6.44 in the header is the same quantity computed from
the unrounded t).  That compares well with the 6.8 reported by Ken.

-- Bill
wgould@stata.com
```
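
The three relationships the post relies on can be checked directly. What follows is a minimal sketch, not Bill's or Ken's code. It assumes a recent Stata and uses runiform() in place of uniform(), so the numbers will not match the output quoted above. The variable momid and the scalar names are hypothetical, introduced only for illustration. The sketch checks that -ttest- and -regress- give the same t in absolute value, that the clustered F equals the squared clustered t, and that cluster(mom) sees only 10 clusters because mother ids 1 to 10 are reused in both groups.

```stata
* Sketch only: rebuilds data with the same layout as in the post.
* Assumption: runiform() replaces uniform(), so results differ from the post.
clear
set seed 32981
set obs 20
gen group = 1 + (_n > 10)                    // 1 = first 10 moms, 2 = last 10
gen mom   = _n - 10*(group - 1)              // ids 1..10 reused in each group
gen z     = 10 + round(runiform()*4 - 2, 1)  // litter size, 8 to 12
expand z
drop z
bysort group mom: gen pup = _n
gen y = runiform()*8 + group

* (1) The pooled-variance t test and -regress- give the same |t|.
ttest y, by(group)
scalar t_ttest = r(t)
regress y group
scalar t_reg = _b[group]/_se[group]
display "ttest |t| = " abs(t_ttest) "   regress |t| = " abs(t_reg)

* (2) With clustering on mom, the reported F is the squared t.
regress y group, cluster(mom)
scalar t_cl   = _b[group]/_se[group]
scalar F_cl   = e(F)
scalar nc_mom = e(N_clust)
display "t^2 = " t_cl^2 "   F = " F_cl "   clusters = " nc_mom

* (3) Hypothetical momid: one id per distinct mother (group x mom).
egen momid = group(group mom)
regress y group, cluster(momid)
scalar nc_momid = e(N_clust)
display "clusters = " nc_momid
```

Step (2) reports 10 clusters, as in the output above, and step (3) reports 20. Whether each of the 20 mothers should be her own cluster depends on how the ids are meant to be read. If mother 1 in the diabetic group and mother 1 in the non-diabetic group are different animals, clustering on something like momid may be the closer match to the design.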