Sample size calculation for means and proportions              [STB-11: sg15]
-------------------------------------------------

The syntax for ^sampsiz^ is:

    ^sampsiz alpha power null test, [m|pr] t([p|c]) s([1|2]) sd() sd1() sd2() r()

where each is defined as follows:

  ^alpha^   = significance value (typically .01, .02, .05, .10)
  ^power^   = 1-beta, with beta usually about 4*alpha (typical values: .95,
              .90, .80, .75)
  ^null^    = null hypothesis (population) mean or (population) proportion
  ^test^    = test or alternative mean or proportion
  ^m|pr^    = Required: m = means test; pr = proportions test
  ^t(p|c)^  = Required: p = null is a population statistic; c = comparison test
  ^s(1|2)^  = 1-sided or 2-sided test. 2-sided is the default, so specify only
              for s(1).
  ^sd^      = required for m t(p) test: the population standard deviation
  ^sd1,sd2^ = required for m t(c) test: the standard deviation of each
              comparison mean
  ^r()^     = ratio between the two sample sizes for a t(c) test with unequal
              sample sizes

^sampsiz^ allows estimation of an appropriate sample size for tests of the
difference between two means or two proportions. The null mean or proportion
may be a population statistic. Both 1-sided and 2-sided tests may be performed.
Moreover, unequal sample sizes are accommodated.

There are various formulae that have been used to calculate sample size. I
have used the following: means testing (Pagano, Rosner); proportions (Pagano,
Fleiss).


^Examples:

Proportions - population vs test [Pagano]
-----------------------------------------

The true population proportion of prostate cancer patients who are under 55
at the time of diagnosis and live for at least 4 years is .25. We wish to test
a group of such patients who are using drug X in the course of their
treatment. We think that the use of the drug will increase survival to .33.
Using a .05 alpha and a power of .80, we run sampsiz as:

    ^. sampsiz .05 .80 .25 .33, pr t(p)

    Estimated Sample Size Computation Proportion
    Number of cases            =>  242
    Z-alpha                    =>  1.96
    Z-power                    =>  0.84


Proportions - comparison [Fleiss, Casagrande et al.]
----------------------------------------------------

We are interested in testing two treatments, one using a standard treatment
and the other a new treatment. We hypothesize a remission rate of .65 for the
former and a rate of .75 for the latter. 765 cases are required in each sample
to guarantee a significance level of .01 and a power of .95.

    ^. sampsiz .01 .95 .65 .75, pr t(c)

    Number of cases: Sample 1  =>  765
    Number of cases: Sample 2  =>  765
    Z-alpha                    =>  2.58
    Z-power                    =>  1.64


Proportions - comparison with unequal sample sizes [Fleiss]
-----------------------------------------------------------

Suppose that there is some opposition to using so many cases for the new
treatment. If we will accept a new treatment sample that is half the size of
the standard treatment sample, we have

    ^. sampsiz .01 .95 .65 .75, pr t(c) r(.5)

    Number of cases: Sample 1  =>  1138
    Number of cases: Sample 2  =>  569
    Z-alpha                    =>  2.58
    Z-power                    =>  1.64


Means - population vs test [Pagano]
-----------------------------------

The true mean serum cholesterol level of U.S. males between the ages of 20 and
74 is 211mg/100ml with a standard deviation of 46mg/100ml. In designing an
experiment to test whether a drug will significantly reduce cholesterol, we
must specify a sample size that provides appropriate power. Suppose we wish to
test whether the effect of the drug will result in a reduction of mean serum
cholesterol level to 180mg/100ml. We set alpha at .01 and the power at .95
since we only want to risk a 5 percent chance of failing to reject a false
null hypothesis. Moreover, since we expect a reduction of level, we use a
1-sided test.

    ^. sampsiz .01 .95 211 180, m t(p) sd(46) s(1)

    Number of cases            =>  35
    Z-alpha                    =>  2.33
    Z-power                    =>  1.64


Means - comparison [Rosner]
---------------------------

We are doing a study of the relationship of oral contraceptives (OC) and
blood pressure (BP) level for women ages 35-39. A pilot study is required in
order to ascertain parameter estimates to plan a larger study. Assuming that
the true BP distribution is normal for both groups, the mean and standard
deviation (SD) of OC users are found to be 132.86 and 15.34, respectively. The
mean and SD of OC non-users are found to be 127.44 and 18.23. For a larger
study with equal sample sizes, a significance level of .05, and a power of
.80, we need the following number of cases in each sample.

    ^. sampsiz .05 .80 132.86 127.44, m t(c) sd1(15.34) sd2(18.23)

    Number of cases: Sample 1  =>  152
    Number of cases: Sample 2  =>  152
    Z-alpha                    =>  1.96
    Z-power                    =>  0.84


Means - comparison with unequal sample sizes [Rosner]
-----------------------------------------------------

Using the same example as above, suppose that we want twice the number of OC
non-users as OC users in our larger study.

    ^. sampsiz .05 .80 132.86 127.44, m t(c) sd1(15.34) sd2(18.23) r(.5)

    Number of cases: Sample 1  =>  107
    Number of cases: Sample 2  =>  215
    Z-alpha                    =>  1.96
    Z-power                    =>  0.84


References
----------

Casagrande, J. T., M. C. Pike, and P. G. Smith. 1978. The power function of
    the exact test for comparing two binomial distributions. ^Appl. Stat.^
    27: 176-180.
Fleiss, J. 1981. ^Statistical methods for rates and proportions^. New York:
    Wiley & Sons.
Pagano, M. and K. Gauvreau. 1993. ^Principles of biostatistics^. Belmont, CA:
    Duxbury/Wadsworth.
Rosner, B. 1986. ^Fundamentals of biostatistics^. Boston: Duxbury Press.


Additional help available from:

    Joseph Hilbe, Editor, STB
    10952 N. 128th Pl., Scottsdale, AZ 85259-4464
    Fax: 602-860-1446; Voice: 602-860-4331
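
Cross-checking the examples (illustrative sketch)
-------------------------------------------------

This help file names the sources of the formulae but does not print them. As a
rough cross-check, the Python sketch below applies the usual textbook
normal-approximation formulas to the examples above. It is not the ^sampsiz^
source code: the function names are hypothetical, the parameter k (sample 2
size divided by sample 1 size) is this sketch's own convention and is not
claimed to match r(), and rounding to the nearest integer is an assumption
that happens to agree with the output listed above. The two-proportion formula
uses a continuity correction and covers equal sample sizes only; the
unequal-size proportions example is not reproduced here.

```python
from math import sqrt
from statistics import NormalDist


def _z(p):
    """Standard normal quantile for cumulative probability p."""
    return NormalDist().inv_cdf(p)


def _z_alpha(alpha, sides):
    # A 2-sided test splits alpha between the two tails.
    return _z(1 - alpha / sides)


def n_mean_one_sample(alpha, power, mu0, mu1, sd, sides=2):
    """Unrounded n for testing a sample mean against a population mean."""
    return ((_z_alpha(alpha, sides) + _z(power)) * sd / (mu1 - mu0)) ** 2


def n_means_two_sample(alpha, power, mu1, mu2, sd1, sd2, k=1.0, sides=2):
    """Unrounded (n1, n2) for comparing two means, with n2 = k * n1."""
    zsum = _z_alpha(alpha, sides) + _z(power)
    n1 = (sd1 ** 2 + sd2 ** 2 / k) * zsum ** 2 / (mu1 - mu2) ** 2
    return n1, k * n1


def n_prop_one_sample(alpha, power, p0, p1, sides=2):
    """Unrounded n for testing a proportion against a population value."""
    num = (_z_alpha(alpha, sides) * sqrt(p0 * (1 - p0))
           + _z(power) * sqrt(p1 * (1 - p1)))
    return (num / (p1 - p0)) ** 2


def n_props_two_sample(alpha, power, p1, p2, sides=2):
    """Unrounded n per group (equal sizes), with continuity correction."""
    pbar = (p1 + p2) / 2
    d = abs(p2 - p1)
    num = (_z_alpha(alpha, sides) * sqrt(2 * pbar * (1 - pbar))
           + _z(power) * sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    n0 = (num / d) ** 2  # uncorrected sample size
    return n0 / 4 * (1 + sqrt(1 + 4 / (n0 * d))) ** 2


print(round(n_prop_one_sample(.05, .80, .25, .33)))           # 242
print(round(n_props_two_sample(.01, .95, .65, .75)))          # 765
print(round(n_mean_one_sample(.01, .95, 211, 180, 46, 1)))    # 35
n1, n2 = n_means_two_sample(.05, .80, 132.86, 127.44, 15.34, 18.23, k=2)
print(round(n1), round(n2))                                   # 107 215
```

Under these assumptions the unrounded first-sample size in the last call is
about 107.3, so a convention that always rounds up would report 108 rather
than the 107 shown in the output above.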