# st: Re: ancova for repeated designs

From: Paul Seed
To: statalist@hsphsun2.harvard.edu
Subject: st: Re: ancova for repeated designs
Date: Fri, 20 Aug 2004 16:45:07 +0100

I personally would tackle the problem using a form of regression with
dummy variables, rather than the -anova- command. This has several advantages:
1. It gives me more control over what is being estimated.
2. It presents the results as estimates with CIs, rather than as general tests.
3. Using the robust option, particularly in conjunction with -xtgee-,
   controls for non-sphericity, heterogeneity etc., rather than just testing for a problem.

Starting with the original data set, a possible series of commands might be:

set matsize 500

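* Check that x is constant within id, and is indeed equal to y[1]
* (the value of y at the first time point)
bys id (t): assert x[1] == x
bys id (t): assert y[1] == x

* This being so, the first time period can be dropped from the outcome:
* it enters the models below only as the covariate x.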

* Set up dummy variables

* main effects of the groups
for num 1/3: gen gX = g == X if g ~= .

* main effects of time
* (not strictly needed, as only 2 time points in the model)
for num 1/3: gen tX = t == X if t ~= .

* treatment-time interaction
* (assuming time effect linear, with zero effect at time 1)
for num 1/3: gen gX_t = gX*(t-1)
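
The -for- prefix used above is an older construct. As an alternative (not to be run in addition, since the variables would already exist), the same dummies could be built with -forvalues-. This is only a sketch, assuming g and t are coded 1, 2, 3 as in the commands above:

```stata
* Alternative to the three -for- lines above; run one version or the other
forvalues i = 1/3 {
    * group and time indicators, left missing where g or t is missing
    gen g`i' = g == `i' if !missing(g)
    gen t`i' = t == `i' if !missing(t)
    * group-by-time term, zero at time 1 and linear in t thereafter
    gen g`i'_t = g`i' * (t - 1)
}
```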

* Set up the -xt- structure
iis id
tis t
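
From Stata 10 onward, -iis- and -tis- are superseded by a single -xtset- call. A minimal equivalent, assuming the same id and t variables:

```stata
* Declare the panel identifier and the time variable in one step
xtset id t
```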

* Produce some estimates & graphs
version 7: qnorm y
tab y
* Clearly not Normal, so we may need to use something more subtle, such as
* interval regression or ordered logistic regression.
* However, that is not the immediate problem.
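
For reference, the ordered logistic alternative mentioned above might look something like the following. It is a sketch only, not part of the analysis in this post, and it assumes y takes a small number of ordered values:

```stata
* Ordered logit version of the constant-effects model,
* with standard errors clustered on subject
ologit y x g2 g3 t if t > 1, cluster(id)
```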

xtgraph y , group(g) xlab( 1 2 3) ylab offset(.05) list

* Pretty the graph up using version 8 graphics
preserve
tempfile myfile
xtgraph y , group(g) list savdat("`myfile'")
/* Insert gr8 commands to taste */
restore

* Now the main analysis

* Constant effects of treatment
regress y x g2 g3 t if t > 1, cluster(id)

* Treatment effect increasing linearly with time
regress y x g2_t g3_t t if t > 1, cluster(id)

* Both models suggest that group 2 typically has lower scores.
* However the first model seems to fit the data better.
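
To illustrate the point about getting estimates with CIs rather than general tests: after the constant-effects model, -lincom- gives any contrast between groups with its confidence interval. The sketch below is not part of the original analysis; it assumes group 1 is the reference category, as it is when g1 is left out of the model.

```stata
* Refit the constant-effects model so that it is the active estimation result
regress y x g2 g3 t if t > 1, cluster(id)

* g2 and g3 are each compared with group 1 in the output;
* this gives the group 2 versus group 3 difference, with its CI
lincom g2 - g3
```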

In practice, I would probably stop there and report the
constant effects model; however, -xt- models can also be fitted,
e.g.
xtgls y x g2 g3 t if t > 1, corr(ar1)
xtgee y x g2 g3 t if t > 1, robust
etc.

I would be interested in comments on which
approach is most appropriate for a data set such as this.

=========================

Paul T Seed (Paul.Seed@kcl.ac.uk)
Division of Reproductive Health, Endocrinology and Development
Guy's Kings and St. Thomas' School of Medicine, King's College London,
St Thomas' Hospital,
London SE1 7EH

tel (+44) (0) 20 7188 3642
fax (+44) (0) 20 7620 1227
Thurs only: (+44) (0) 20 7848 4208
