The formula for estimating the sample size based on the width of a confidence interval for a proportion is:

    n = (z^2 * p * q) / d^2

where z is the standard normal critical value for your chosen alpha level (1.96 for a 95% interval), q = 1 - p, and d is the half-width of the interval, i.e., the distance between p and its upper (or lower) limit.

For example, suppose you expect a proportion of .35 and you want to be 95% sure that p is no larger than .40 (given that p is .35). With z=1.96, p=.35, q=.65, and d=(.40 - .35) or .05, your sample size is about 350.

Interestingly, you can check this using the -cii- command:

    . cii 350 .35

The difference, though, is that these confidence intervals are exact.

In your case, I wouldn't use the normal approximation to the binomial because your proportion is so small. You could use -cii- and try different sample sizes while maintaining the same proportion, e.g.,

    . cii 1000 1

will give a wide confidence interval, and

    . cii 10000 10

will get you a narrower one. (Note these are exact limits, not normal approximations.)

What sampsi does is model both alpha error and beta error. As long as you don't specify a value for an alternative hypothesis (i.e., all you are interested in is interval estimation), you don't need to model beta error.

Paul

-----Original Message-----
From: Don Spady [mailto:dspady@ualberta.ca]
Sent: Monday, April 28, 2003 1:07 PM
To: statalist@hsphsun2.harvard.edu
Subject: st: Re: RE: Sample size

Paul

Thanks for your reply. Indeed I want to estimate prevalence, with the interval being from 0.0001 to 0.0003 or thereabouts. I was told that the prevalence of the disease was between 1:200 and 1:2000, possibly closer to 1:200. By shooting at 0.0005, I would get the worst-case scenario. The confidence interval is hard to guess (say the real value is 1:200 and I test for 1:2000, how do I estimate a confidence interval?). If the presence or absence follows a Poisson distribution, then the variance is 1:2000 and the SD is 0.0224, I think. Does this make much sense?

Don

----- Original Message -----
From: "VISINTAINER PAUL" <VISINT@NYMC.EDU>
To: <statalist@hsphsun2.harvard.edu>
Sent: Monday, April 28, 2003 10:09
Subject: st: RE: Sample size

> Don,
>
> The problem you are having with sample size is that you haven't given enough
> information. It isn't clear whether you want to simply estimate the
> prevalence/incidence of a condition in the population; whether you want to
> "test" whether the occurrence in the population is really .001; or whether
> you want to test the difference between groups, assuming the occurrence in
> general is .001. The last two options require you to specify an alternative
> hypothesis, which you haven't given.
>
> Using your sampsi input, you are specifying a comparison between a
> prevalence of 1 per 1000 vs. none (or a really very, very rare prevalence).
> In this case you're specifying that the null value is .001 and your
> alternative is that it is much more rare than that. If you reverse your
> figures (e.g., sampsi 0 .001, p(.8)) you're specifying that the null value
> is near 0 and your alternative hypothesis is that it is much more prevalent.
>
> (I was actually surprised that sampsi performed the calculation with 0 as an
> entry. I suppose it actually uses a very small value for 0.)
>
> For the first option, if you would rather just estimate the prevalence of
> this condition (which you think is pretty rare at .001), you might want to
> focus on the precision of the estimate by specifying the width of the
> confidence interval. I don't think we can get a sample size estimate based
> on the width of a confidence interval using sampsi.
>
> So, what do you want to do?
>
> Paul
>
>
> -----Original Message-----
> From: Don Spady [mailto:dspady@ualberta.ca]
> Sent: Monday, April 28, 2003 11:19 AM
> To: Statalist
> Subject: st: Sample size
>
> Dear all
> I sent this before but got no response. I have revised it.
> I want to estimate the sample size needed to detect a disease that
> occurs in 1 out of 1000 people (as an example). The alternate
> state is absence of disease, which would occur in 999 of 1000 people on
> average. The problem is that I get numbers but I don't know if they
> are the right ones. Can I use sampsi, with grp1 being those with disease
> and grp2 being those without disease? Or do I use sampsi 0.001,
> onesample, as in:
>
> sampsi 0.001 0, p(0.8) onesample
>
> I need help and thank in advance those that provide it.
>
> Donald Spady
> Dep't of Pediatrics, University of Alberta
> (780) 407-1244
>
> Nature has no reset button.

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
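
For readers who want to reproduce the arithmetic in Paul's reply at the top of this thread outside Stata, here is a minimal Python sketch (standard library only). It is an illustration under stated assumptions, not what -cii- or -sampsi- actually run: the function names are hypothetical, and the "exact" limits are assumed to be Clopper-Pearson binomial limits, found here by simple bisection.

```python
from math import ceil, exp, lgamma, log, log1p
from statistics import NormalDist


def sample_size_for_halfwidth(p, d, conf=0.95):
    """Normal-approximation sample size: n = z^2 * p * q / d^2, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # about 1.96 for conf=0.95
    return ceil(z * z * p * (1 - p) / (d * d))


def _binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space."""
    if x < 0:
        return 0.0
    total = 0.0
    for k in range(min(x, n) + 1):
        log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                   + k * log(p) + (n - k) * log1p(-p))
        total += exp(log_pmf)
    return min(total, 1.0)


def _bisect(f, target, increasing, iters=100):
    """Solve f(p) = target for p in (0, 1), given that f is monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, conf=0.95):
    """Clopper-Pearson limits for x successes in n trials (assumed method)."""
    alpha = 1 - conf
    # Lower limit: the p at which P(X >= x) equals alpha/2 (0 when x == 0).
    lower = 0.0 if x == 0 else _bisect(
        lambda p: 1 - _binom_cdf(x - 1, n, p), alpha / 2, increasing=True)
    # Upper limit: the p at which P(X <= x) equals alpha/2 (1 when x == n).
    upper = 1.0 if x == n else _bisect(
        lambda p: _binom_cdf(x, n, p), alpha / 2, increasing=False)
    return lower, upper


if __name__ == "__main__":
    # Paul's worked example: p = .35, half-width d = .05, 95% confidence.
    print(sample_size_for_halfwidth(0.35, 0.05))
    # Same prevalence (1 per 1000) at two sample sizes.
    print(exact_ci(1, 1000))
    print(exact_ci(10, 10000))
```

Running the last two calls side by side shows the point made in the reply: with the proportion held at 1 per 1000, the interval from 10000 observations is much narrower than the one from 1000, and both are asymmetric around the estimate, which is why the normal approximation is a poor fit for a prevalence this small.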

