Stata The Stata listserver

st: re: ANCOVA for pre post designs

From   David Airey <>
Subject   st: re: ANCOVA for pre post designs
Date   Tue, 23 Dec 2003 21:01:38 -0600

In my description of prepulse inhibition of startle, I meant to say *MORE* where I said less. The corrected text:

In an area of schizophrenia research, subjects show a deficit in basic sensorimotor gating, as measured by prepulse inhibition of the acoustic startle response. The startle response is simply startle to a loud noise. Prepulse inhibition is simply inhibition of that startle response by preceding the loud noise with a soft noise. In both non-schizophrenics and schizophrenics, startle is comparable, but prepulse inhibition is _less_ in schizophrenics. That is, both groups startle comparably to a loud noise, but schizophrenics startle *MORE* when a startling noise is preceded by a soft noise. So, there are two brain circuits underlying this behavior and the prepulse inhibition circuit is compromised in schizophrenics.

Constantine Daskalakis replied to the original email:

This is a question for the biostatisticians on the list.

I'm thinking of formulating a commentary on accepted research procedures in my area that I think could be improved by observing basic statistical arguments presented to researchers by biostatisticians.

It has been suggested that, in a randomized clinical trial with baseline (B) and followup (F) measures comparing a control and a treatment group (G), performing an ANOVA on the ratio of followup to baseline is the worst of the 4 ways to deal with baseline differences:

(1) post: analyze F by G
(2) difference: analyze F-B by G
(3) ratio: analyze F/B by G
(4) ancova: analyze F = constant + b1*B + b2*G, for G differences
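
As a rough illustration of the four analyses listed above, here is a minimal Python sketch (not Stata, and not taken from the cited paper). The simulated data, the parameter values, and the function names simulate, group_difference, and ancova_group_effect are all assumptions made up for this example. Methods (1) to (3) are shown as a plain difference in group means, and method (4) as the ordinary least squares coefficient b2.

```python
import random
from statistics import mean


def simulate(n_per_group=200, effect=-5.0, seed=1):
    """Simulate baseline B, follow-up F and group G (0 = control, 1 = treatment).

    All numbers here are invented for illustration only.
    """
    rng = random.Random(seed)
    b, f, g = [], [], []
    for group in (0, 1):
        for _ in range(n_per_group):
            base = rng.gauss(50.0, 10.0)
            follow = 20.0 + 0.6 * base + effect * group + rng.gauss(0.0, 8.0)
            b.append(base)
            f.append(follow)
            g.append(group)
    return b, f, g


def group_difference(values, g):
    """Mean of the treatment group (g == 1) minus mean of the control group."""
    treated = [v for v, gi in zip(values, g) if gi == 1]
    control = [v for v, gi in zip(values, g) if gi == 0]
    return mean(treated) - mean(control)


def ancova_group_effect(b, f, g):
    """OLS estimate of b2 in F = constant + b1*B + b2*G.

    Solves the 2x2 normal equations on mean-centered variables.
    """
    mb, mf, mg = mean(b), mean(f), mean(g)
    sbb = sum((x - mb) ** 2 for x in b)
    sgg = sum((x - mg) ** 2 for x in g)
    sbg = sum((x - mb) * (y - mg) for x, y in zip(b, g))
    sbf = sum((x - mb) * (y - mf) for x, y in zip(b, f))
    sgf = sum((x - mg) * (y - mf) for x, y in zip(g, f))
    return (sbb * sgf - sbg * sbf) / (sbb * sgg - sbg ** 2)


b, f, g = simulate()
print("(1) post:      ", group_difference(f, g))
print("(2) difference:", group_difference([fi - bi for bi, fi in zip(b, f)], g))
print("(3) ratio:     ", group_difference([fi / bi for bi, fi in zip(b, f)], g))
print("(4) ancova:    ", ancova_group_effect(b, f, g))
```

Note that (1), (2) and (4) all estimate the same additive treatment effect on the original scale, while (3) is on a different (ratio) scale, so its estimate is not directly comparable to the others.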

In light of biostatisticians' suggestion (e.g., Vickers, BMC Medical Research Methodology (2001) 1:6) that method (4) above is the most preferred and method (3) the least preferred, does it apply to the "prepulse inhibition" literature?

In large trials, (1) should be fine (at least, in terms of no bias). But (2) or (4) may be more efficient.

(3) above is similar in flavor to (2) if you view it on the log scale, i.e.,

(logF-logB) by G (or, equivalently, log(F/B) by G).

A technical question is whether the original measurements (B and F), or their difference on the original scale, or their log-ratio (ie, difference of logs) more closely conforms to the assumptions of linear regression (normality of residuals, homoskedasticity).
Actually, in my data a square root or log transformation makes the raw data more normal so I'll think about this.

Still, I wouldn't do it on (F/B) but rather on log(F/B) if that looks good.
Why wouldn't you do it?

There is a difference in the underlying scientific model and interpretation, of course.

Does the treatment work additively (ie, adds a fixed amount, no matter where you start)? If so, the difference (F-B) would be a good choice (constant additive treatment effect across all values of B). And you'll be talking about the (arithmetic) mean difference for treatment vs. control.

But if the treatment works multiplicatively (ie, increases/decreases your original B measurement by a certain percent), then log(F/B), ie, logF-logB, would be better. And then, by exponentiating the regression coefficients etc, you'll be talking about the geometric mean ratio for treatment vs. control.
Thanks for these points.
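
To make the additive versus multiplicative distinction concrete, here is a small Python sketch, assuming a made-up data set in which treatment cuts each subject's baseline by 20% and control leaves it unchanged. The function name geometric_mean_ratio and the numbers are hypothetical. It exponentiates the difference in mean log(F/B) between groups, which is what the group coefficient from a regression of log(F/B) on G would give.

```python
import math
from statistics import mean


def geometric_mean_ratio(b, f, g):
    """exp(mean log(F/B) in treatment minus mean log(F/B) in control)."""
    log_ratio = [math.log(fi) - math.log(bi) for bi, fi in zip(b, f)]
    treated = [v for v, gi in zip(log_ratio, g) if gi == 1]
    control = [v for v, gi in zip(log_ratio, g) if gi == 0]
    return math.exp(mean(treated) - mean(control))


# Invented example: control unchanged, treatment multiplies baseline by 0.8.
b = [10.0, 20.0, 40.0, 80.0]
g = [0, 0, 1, 1]
f = [10.0, 20.0, 32.0, 64.0]

print("geometric mean ratio:", geometric_mean_ratio(b, f, g))
# On the original scale the change F-B depends on where each subject started.
print("raw changes F-B:", [fi - bi for bi, fi in zip(b, f)])
```

In this constructed case the ratio is the same for every treated subject, while the raw change F-B is twice as large for the subject who started twice as high, which is why the difference score would be a poor summary under a multiplicative effect.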

Finally, the choice between (2) and (4) depends on the correlation between baseline and follow-up measurements. I think that when corr(B,F) < 0.5, then (4) turns out to be more efficient; otherwise, (2) is better. I believe there's a paper by Liang & Zeger on this.
The paper I cited was a quick power analysis of the 4 approaches. There, ANCOVA is always the most efficient. The difference score is more efficient than followup alone only when corr(B,F) is high (above 0.5, if B and F have equal variances). The F/B ratio is also reported to be very sensitive to the baseline distribution: power declines as the variance of B increases.
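
The efficiency comparison can be sketched with the standard large-sample variance formulas, assuming randomized groups, equal variance at baseline and followup, and a common correlation rho = corr(B,F). Under those assumptions the variance of the treatment-effect estimate is proportional to 1 for post only, 2(1 - rho) for the difference score, and 1 - rho^2 for ANCOVA. The function name relative_variances is hypothetical, and this is an idealized calculation rather than the power analysis from the cited paper.

```python
def relative_variances(rho):
    """Variance of each treatment-effect estimator relative to post-only.

    Assumes equal variances for B and F, correlation rho between them,
    randomized groups, and large samples.
    """
    return {
        "post": 1.0,                  # (1) analyze F by G
        "change": 2.0 * (1.0 - rho),  # (2) analyze F-B by G
        "ancova": 1.0 - rho ** 2,     # (4) F adjusted for B
    }


for rho in (0.0, 0.3, 0.5, 0.7, 0.9):
    v = relative_variances(rho)
    print(f"rho={rho:.1f}  post={v['post']:.2f}  "
          f"change={v['change']:.2f}  ancova={v['ancova']:.2f}")
```

Smaller is better here. Under these assumptions the difference score beats post only once rho exceeds 0.5, while ANCOVA is never worse than either, because 1 - rho^2 = (1 - rho)(1 + rho) is at most both 1 and 2(1 - rho).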

OK, let me ask a simpler question: can one have baseline covariates in within-subjects ANOVAs like we have in ANCOVAs, which are between-subject ANOVAs but with covariates?
