The Stata listserver

st: Re: ancova for repeated designs


From   Joseph Coveney <jcoveney@bigplanet.com>
To   Statalist <statalist@hsphsun2.harvard.edu>
Subject   st: Re: ancova for repeated designs
Date   Mon, 16 Aug 2004 23:25:16 +0900

tmmanini wrote:

. . . You are right that I have only 6 levels of covariate (a possible
problem), but I took your advice on several fronts and I'm still not fully
comprehending the solution.  Here's what I did: (I used my data with 32
subjects, which is included at the end).

First I ran the model positioning g after the id|g random error term, and
then specifying if t>1.  I got a sig. interaction, but according to a recent
addition to the listserv, I learned that this interaction may not be as
important as I once thought. Therefore, I dropped the g*t term from the
model.

[excerpted ANOVA table with zero sum of squares and degrees of freedom for
continuous covariate]

These results seemed weird, based on the previous F value for x being much
higher.  So I dropped t>1 from the model

[excerpted ANOVA table with different results]

I'm not sure which model is correct.  Based on a recent addition by Joseph
Coveney, the last model (without t>1) would be correct.

Here is the data, sorry it is long, there are 32 subjects, 3 levels of g, 3
levels of t and 1 level of x (remember, x is the first level of t (time)).
I'm trying to covary for the pre-test level (time==1).  One more thing, I
successfully implemented the adjust command by including id in the "by"
statement.  However, I only received adjustments for those subjects I
specify (ie. id<=4 gives me subjects 1 through 3), which makes sense.
However, I would like to report the adjusted mean for each group over each
time period.  I guess I can request all id's be shown on the output by using
"adjust x, by(g t id)" and then taking the mean of the id's for each group,
but that seems cumbersome.  Is there a better way? . . .

[dataset excerpted]

----------------------------------------------------------------------------

David Airey seems to have been on the right track a couple of posts ago.
The inconsistent estimates suggest that there is some kind of collinearity
between the groups-by-subject interaction term and the covariate, which
undermines estimation.  You can see it happening using -anova , sequential-
and stepwise shifting the position of x from first to after the id|g term.
You cannot do this with Stata's repeated-measures ANOVA syntax for
subjects-within-groups error term--you need to use alternative syntax and
just call it what it is, an interaction term.  This is illustrated below in
a do-file:  as the covariate enters the model going past the
groups-by-subjects interaction term in a sequential sums-of-squares ANOVA
(SAS Type I sums of squares), its sum of squares is zeroed.  Your dataset is
imbalanced, and this often induces this phenomenon to at least some extent
in factorial ANOVA, but I didn't think that it would wreak this much havoc,
even with repeated-measures ANOVA.  The imbalance might be compounding the
effects of collinearity otherwise present among the covariate and factors.
What did SPSS give you, by the way?

At a loss as to what else to suggest, depending upon your objectives you can
try -xtreg, re-; transforming the covariate somehow (centering works for
some situations, like polynomials); perhaps breaking the analysis into two
(groups 1 and 2, groups 1 and 3, Bonferroni adjustment); or other avenues
that others on the list might suggest.
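
As a rough sketch of the first two suggestions (not a tested solution for
this dataset), centering the covariate and fitting a random-intercept model
might look like the following.  It assumes the long-format data from the
do-file below are in memory (variables id, g, t, y, x) with the t == 1 rows
already dropped; x_c is a hypothetical variable name, and the syntax is that
of the Stata release current at the time of this post.

```stata
* Sketch only: assumes variables id, g, t, y, x are in memory and that
* the t == 1 observations have already been dropped.
* Center the covariate at its overall mean (x_c is a hypothetical name).
summarize x, meanonly
generate float x_c = x - r(mean)
* Random-intercept model for subjects, with group, time, and their
* interaction expanded into indicator variables by -xi-.
xi: xtreg y i.g*i.t x_c, i(id) re
```

Centering does not change the slope for x; it only moves the intercept, so
whether it helps with the collinearity seen here is an open question.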

As far as dropping the term for interaction of group and continuous
covariate, assumption of homogeneity of slope is actually important.  When
the slope is substantially different between the groups, it complicates
interpretation and qualifies the conclusions.  It's just difficult to test
for interaction with adequate power in the average situation; at least it is
for interactions of categorical variables.

Don't drop the -if t > 1- from the command (model statement).  You seem to
have got confused by the Winer example--it didn't use the first time point's
response values as the covariate, so it didn't need to exclude the first
time point from analysis.  You do.  (In the do-file below, I pre-emptively
dropped the first observations, so the model statement no longer needs
the -if t > 1-.)

You're right and I was mistaken as far as -adjust- goes:  subjects will
reflect their own intercepts and not the group average.  You could use
-predict- with the covariate fixed at its mean, as in the do-file below.

Joseph Coveney

"share the within-subjects error term with the between-subjects
factor"--hope this clears up before next Monday.

clear
set more off
input byte id byte g byte t byte y byte x
[dataset excerpted--given in earlier post in this thread]
end
assert x == y if t == 1
drop if t == 1
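* Stepwise shifting of entry position of x (sequential, i.e., Type I, sums
* of squares); watch the SS for x go to zero once x enters after id*g
anova y x g id*g t g*t, continuous(x) sequential
anova y g x id*g t g*t, continuous(x) sequential
anova y g id*g x t g*t, continuous(x) sequential
*
* Repeated-measures ANCOVA including the g*x (heterogeneity of slopes) term,
* followed by residual plots
quietly anova y g / id|g x g*x t g*t, continuous(x)
predict y_hat, xb
predict y_res, residual
graph7 y_res g, xlabel ylabel yline(0)
graph7 y_res y_hat, xlabel ylabel yline(0)
drop y_*
* Using -predict- for within-cell x-adjusted predicted means:
* set x to its overall mean for every observation, then predict
rename x x_prime
summarize x_prime, meanonly
generate float x = r(mean)
predict y_hat
bysort g t: summarize y_hat
*
* Restore x and fit the multivariate alternative on wide-format data
drop x y_hat
rename x_prime x
reshape wide y, i(id) j(t)
manova y2 y3 = g x, continuous(x)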
exit


