# Re: st: power or sample size by survival vs. comparison of proportions

| From | ymarchenko@stata.com (Yulia Marchenko, StataCorp LP) |
| --- | --- |
| To | "statalist" |
| Subject | Re: st: power or sample size by survival vs. comparison of proportions |
| Date | Sat, 21 Jun 2008 19:29:07 -0500 |

```
"Rosy Reynolds" <rr@dandr.demon.co.uk> asks about differences in the sample
sizes estimated using the binomial distribution (-sampsi-) and the Cox
proportional hazards model (-stpower cox-) for survival data:

> Suppose there are two groups of equal size in a (proportional hazards)
> survival study. I follow them up for such a time that overall 50% of
> participants die, and I am looking for a hazard ratio of 1.5. By the
> end of follow-up, therefore, 60% would die in the higher risk group
> and 40% in the lower risk group.

> If I was going to analyse simply by comparing the proportions who died
> in the two groups, I could estimate the number needed for 80% power,
> 5% significance, with
>
> . sampsi 0.6 0.4, power(0.8) alpha(0.05)
>
> If I was going to analyse by using Cox regression, I could estimate
> the  number with
>
> . stpower cox, hratio(1.5) power(0.8) alpha(0.05) failprob(0.5)
>
>
> -sampsi- estimates that I need 214 participants (107 in each group),
> while -stpower- estimates a need for 382 participants to observe
> 191 deaths.
> ...

Rosy used p1 = 0.6 and p2 = 0.4 in the -sampsi- command as the group-specific
proportions of participants expected to die by the end of the study.  However,
I believe that Rosy obtained these values from the formula for a hazard ratio
based on hazard rates (hr = h1/h2, so 1.5 = 0.6/0.4) rather than from the
formula based on the proportions surviving to the end of the study
(hr = ln(1-p1)/ln(1-p2)).  As a result, the two commands were given
inconsistent assumptions, which explains the large difference between the
-sampsi- and -stpower- results.

Rosy either has a hazard ratio of 1.5 (in which case the proportions are not
0.6 and 0.4), or she has proportions of 0.6 and 0.4 (in which case the hazard
ratio is not 1.5).

In the first case, assuming a hazard ratio of 1.5 and an average death rate of
50%, Rosy would need to solve the nonlinear equation (1-p1) = (1-p2)^1.5
subject to the constraint (p1+p2)/2 = 0.5 to obtain the correct values,
approximately p1 = 0.57 and p2 = 0.43, for use in the -sampsi- command.  Using
these values, the required total sample size is 428 (214 per group), which is
comparable to the sample size of 382 that Rosy obtained from -stpower cox-.

. sampsi 0.57 0.43, p(0.8)

Estimated sample size for two-sample comparison of proportions

Test Ho: p1 = p2, where p1 is the proportion in population 1
                    and p2 is the proportion in population 2
Assumptions:

         alpha =   0.0500  (two-sided)
         power =   0.8000
            p1 =   0.5700
            p2 =   0.4300
         n2/n1 =   1.00

Estimated required sample sizes:

            n1 =      214
            n2 =      214

In the second case, Rosy can use a hazard ratio of
ln(1-0.6)/ln(1-0.4) = 1.79 with -stpower cox-.  If we specify -hratio(1.79)-
instead of -hratio(1.5)-, we obtain a required sample size of 186, which is
comparable to the 214 that Rosy obtained from -sampsi-.

. stpower cox, hratio(1.79) power(0.8) alpha(0.05) failprob(0.5)

Estimated sample size for Cox PH regression
Wald test, log-hazard metric
Ho: [b1, b2, ..., bp] = [0, b2, ..., bp]

Input parameters:

        alpha =    0.0500  (two sided)
           b1 =    0.5822
           sd =    0.5000
        power =    0.8000
    Pr(event) =    0.5000

Estimated number of events and sample size:

            E =        93
            N =       186

-- Yulia
ymarchenko@stata.com
```
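
The arithmetic in the reply above can be cross-checked outside Stata. The following is a minimal Python sketch, not part of the original post: the function names are hypothetical, the solver is a plain bisection, and the event-count formula is an assumed Schoenfeld-type approximation, E = (z_{1-alpha/2} + z_{power})^2 / (sd * ln(hr))^2 with N = E / Pr(event), chosen because it agrees with the b1, sd, E, and N values shown in the -stpower cox- output. It does not attempt to reproduce -sampsi-, which uses a different (binomial) calculation.

```python
import math
from statistics import NormalDist


def hazard_ratio_from_proportions(p1, p2):
    """Hazard ratio implied by the proportions failing in each group,
    using S1 = S2**hr under proportional hazards, with S = 1 - p."""
    return math.log(1 - p1) / math.log(1 - p2)


def proportions_from_hazard_ratio(hr, mean_p=0.5, iterations=60):
    """Solve (1 - p1) = (1 - p2)**hr subject to (p1 + p2)/2 = mean_p.

    Plain bisection on p1; assumes hr > 1 and 0 < mean_p < 1.
    """
    def gap(p1):
        p2 = 2 * mean_p - p1
        return (1 - p1) - (1 - p2) ** hr

    # gap is positive at p1 = mean_p and negative at the upper bound.
    lo, hi = mean_p, min(1.0, 2 * mean_p)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    p1 = (lo + hi) / 2
    return p1, 2 * mean_p - p1


def events_and_sample_size(hr, failprob, alpha=0.05, power=0.8, sd=0.5):
    """Assumed Schoenfeld-type approximation for a two-sided test:
    events E rounded up, then N = E / failprob rounded up."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    events = math.ceil(z ** 2 / (sd * math.log(hr)) ** 2)
    return events, math.ceil(events / failprob)


if __name__ == "__main__":
    print(proportions_from_hazard_ratio(1.5))        # case 1: p1, p2
    print(hazard_ratio_from_proportions(0.6, 0.4))   # case 2: implied hr
    print(events_and_sample_size(1.5, 0.5))          # Rosy's -stpower cox- call
    print(events_and_sample_size(1.79, 0.5))         # the corrected call
```

Rounded to two decimals, the first two calls should give the 0.57/0.43 and 1.79 figures quoted in the reply, and the last two should match the event and sample-size counts reported for -hratio(1.5)- and -hratio(1.79)-.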