# st: Sample Size Calcs, Multiple testing and CI's

 From stephen.kay@adelphigroup.com To statalist@hsphsun2.harvard.edu Subject st: Sample Size Calcs, Multiple testing and CI's Date Wed, 25 Feb 2009 10:18:43 -0000

```I'm trying to come up with a sample size calculation for a proposed patient
study which has fourteen equally important endpoints - different quality of
life measures (assume all continuous). All endpoints involve the same
patients being statistically compared against published norm values (t
tests). Each of these norm values has itself come from a different
study (14 in all - one for each endpoint, providing norm mean, SD and Nnorm).

Once the study is finished I'll be asked to provide 95% CI's for mean
differences against norms for each of the fourteen endpoints.

I can't find any references that cover my problems:

Problem 1: If I adopt a family wise error rate (FWER) approach, am I right in
thinking that my sample size calculation should focus on an "ANOVA like"
test statistic that tests Ho: All mean differences are zero? If yes, how do
I form such a statistic? If no, can I compute a corrected individual P value
threshold for each of the 14 tests using a recognised correction method
(e.g. Sidak or Holm) and base the sample size on the largest sample
generated across the fourteen that achieves 80% power for alpha = corrected
P value threshold?

Problem 2 (follows on from a "no" response in problem 1): On reading about
corrected P value threshold methods, one way of classifying them is by step
type (one-step, e.g. Bonferroni; step-up, e.g. Hochberg; and step-down, e.g.
Holm). I appreciate that Bonferroni is the most conservative and the other
methods are better to apply once the data is in. However the most stringent
P value generated by the step up and step down methods is often almost
identical to the one P value generated by Bonferroni and surely it is the
most stringent P value I am forced to use in the sample size calculations?
If this is so then I'm no better off using these methods than Bonferroni in
terms of study planning. Is this correct? I don't think I'm justified in
taking the average corrected P value across the steps in an up or down
procedure?

Problem 3: I'm at a bit of a loss concerning generating 95% CI's for
individual mean differences once the data is back in. Obviously they have to
correspond to the methods proposed in the sample size calculations. I'm
perplexed as to how I would calculate these for a step up or down P
correction method? I assume for a one step method such as Bonferroni, I'd
just apply the standard formula with the adjusted critical P value (e.g.
0.05/14 for Bonferroni) to generate 95% CI's for the individual mean
differences?

I realise I have not mentioned FDR methods - which I may well be forced to
adopt given the number of comparisons I'm forced to make. I don't think
there's any point in performing an "ANOVA type" test if you are controlling
the FDR? Although every other difficulty mentioned above still applies? In
particular I'm really uncertain how to generate individual confidence
intervals for the mean differences under this approach?

Any help or relevant references that could solve my problems would be most
appreciated.

Stephen Kay