# st: RE: Re: RE: Sample size

 From VISINTAINER PAUL To "'statalist@hsphsun2.harvard.edu'" Subject st: RE: Re: RE: Sample size Date Mon, 28 Apr 2003 13:35:52 -0400

```The formula for estimating the sample size based on the width of a
confidence interval for a proportion is:

n = (z^2 * p * q)/(d)^2, where z is the normal critical value for the chosen
alpha level, q = 1 - p, and d is the one-sided difference between p and the
upper (or lower) limit.  For example, suppose you expect a proportion of .35
and you want to be 95% sure that p is no larger than .40 (given that p is
.35).  Then with z=1.96, p=.35, q=.65, and d=(.40 - .35)=.05, n works out to
about 350.

Interestingly, you can get this using the -cii- command.

.cii 350 .35

The difference, though, is that these confidence intervals are exact.

In your case, I wouldn't use the normal approximation to the binomial
because your proportion is quite rare.  You could use -cii- and try
different sample sizes, while maintaining the same proportion, e.g.,

.cii 1000 1,  will give a wide confidence interval

.cii 10000 10, will get you a narrower one.

(Note these are exact limits, not normal approximations)

What sampsi does is model both alpha error and beta error.  As long as you
don't specify a value for an alternative hypothesis (i.e., all you are
interested in is interval estimation) you don't need to model beta error.

Paul

-----Original Message-----
Sent: Monday, April 28, 2003 1:07 PM
To: statalist@hsphsun2.harvard.edu
Subject: st: Re: RE: Sample size

Paul
the interval being from 0.0001 to 0.0003 or thereabouts.  I was told
that the prevalence of the disease was between 1:200 and 1:2000,
possibly closer to the 1:200.  By shooting at 0.0005, I would get the
worst case scenario.  The confidence interval is hard to guess (say the
real value is 1:200 and I test for 1:2000, how do I estimate a
confidence interval?).  If the presence or absence follows a Poisson
distribution, then the variance is 1:2000 and the SD is 0.0224, I think.
Does this make much sense?

Don

----- Original Message -----
From: "VISINTAINER PAUL" <VISINT@NYMC.EDU>
To: <statalist@hsphsun2.harvard.edu>
Sent: Monday, April 28, 2003 10:09
Subject: st: RE: Sample size

> Don,
>
> The problem you are having with sample size is that you haven't given enough
> information.  It isn't clear whether you want to simply estimate the
> prevalence/incidence of a condition in the population; whether you want to
> "test" whether the occurrence in the population is really .001, or whether
> you want to test the difference between groups, assuming the occurrence in
> general is .001.  The last two options require you to specify an alternative
> hypothesis, which you haven't given.
>
> Using your sampsi input, you are specifying a comparison between a
> prevalence of 1 per 1000 vs. none (or a really very, very rare prevalence).
> In this case you're specifying that the null value is .001 and your
> alternative is that it is much more rare than that.  If you reverse your
> figures (e.g., sampsi 0 .001, p(.8)) you're specifying that the null value
> is near 0 and your alternative hypothesis is that it is much more prevalent.
>
>
> (I was actually surprised that sampsi performed the calculation with 0 as an
> entry.  I suppose it actually uses a very small value for 0.)
>
> For the first option, if you would rather just estimate the prevalence of
> this condition (which you think is pretty rare at .001), you might want to
> focus on the precision of the estimate by specifying the width of the
> confidence interval.  I don't think we can get a sample size estimate based
> on the width of a confidence interval using sampsi.
>
> So, what do you want to do?
>
> Paul
>
>
> -----Original Message-----
> Sent: Monday, April 28, 2003 11:19 AM
> To: Statalist
> Subject: st: Sample size
>
> Dear all
>   I sent this before but got no response.  I have revised it.
> I want to estimate the sample size needed to detect a disease that
> occurs in 1 out of 1000 people (as an example).   The alternate
> state is absence of disease which would occur in 999 of 1000 people on
> average.   The problem is that I get numbers but I don't know if they
> are the right ones.  Can I use sampsi with grp1 being those with disease
> and grp2 being those without disease?  Or do I use sampsi 0.001,
> onesample as in:
>
> sampsi 0.001 0, p(0.8) onesample
>
> I need help and thank in advance those that provide it.
>
> Dep't of Pediatrics, University of Alberta
> (780) 407-1244
>
> Nature has no reset button.
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> *
>

```
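
The arithmetic in Paul's reply can be checked with a short Python sketch. It is an illustration only, not anything posted in the thread: the function names `n_for_half_width`, `binom_cdf`, and `exact_ci` are made up here, and the "exact" limits are assumed to be of the Clopper-Pearson kind, found by simple bisection rather than by whatever method -cii- uses internally.

```python
import math


def n_for_half_width(p, d, z=1.96):
    """Normal-approximation sample size: n = z^2 * p * q / d^2, rounded up."""
    q = 1.0 - p
    return math.ceil(z * z * p * q / (d * d))


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term built in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_term)
    return min(total, 1.0)


def exact_ci(x, n, level=0.95):
    """Clopper-Pearson limits for x events in n trials (assumed method)."""
    tail = (1.0 - level) / 2.0

    def bisect(keep_low):
        # Halve [lo, hi] repeatedly; keep_low(mid) says the root lies above mid.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2.0
            if keep_low(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0

    # Lower limit: the p at which P(X >= x) rises to the tail probability.
    lower = 0.0 if x == 0 else bisect(
        lambda p: 1.0 - binom_cdf(x - 1, n, p) < tail)
    # Upper limit: the p at which P(X <= x) falls to the tail probability.
    upper = 1.0 if x == n else bisect(
        lambda p: binom_cdf(x, n, p) > tail)
    return lower, upper


# Paul's worked example: p = .35, d = .05, z = 1.96
print(n_for_half_width(0.35, 0.05))   # 350

# Same proportion (1 per 1000) at two sample sizes, as in the -cii- examples
for n, x in [(1000, 1), (10000, 10)]:
    lower, upper = exact_ci(x, n)
    print(n, x, lower, upper, upper - lower)
```

Looping over further candidate sample sizes at a fixed proportion mirrors the trial-and-error approach with -cii- suggested in the reply: the interval width shrinks as n grows, and one stops at the first n whose width is acceptable.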