# Re: st: RE: binary proportions and sample size

 From nicola.baldini2@unibo.it To statalist@hsphsun2.harvard.edu Subject Re: st: RE: binary proportions and sample size Date Mon, 06 Apr 2009 18:18:00 +0200

```and the winner is... Paul, who best guessed what I had in mind :)
Many thanks to Martin and Austin, too.
Let me summarize. Paul says that the sample suggests that the true number of problem tools is between 13 (1.5% x 900) and 119 (13.2% x 900), with 95% certainty, and that the larger the sample size, the higher the precision I can have. Let me rephrase for Austin: what is the chance that the true number of problem tools is exactly 49 (5.4% x 900)? (And one more question: what is the chance if I relax my request a little and allow the true number to be between 48 and 50?)
I am trying to answer such a question following Martin, but the one-sample -prtesti- doesn't take advantage of the information that the population size is 900, while the two-sample -prtesti- says that the chance is 1 (prtesti 74 .054 900 .054 produces Ha: diff != 0  p = 1)

Nicola

At 02.33 27/03/2009 -0400, you wrote:
>Nicola,
>
>You can't do a "statistical test" yet, because you haven't specified what your "expectation" (i.e., null hypothesis) is.  What you can do is determine the degree of precision you have for your estimate, by computing the confidence interval.
>
>.cii 74 4
>
>This gives your estimate of the proportion of problem tools as 5.4% with a 95% confidence interval of 1.5% to 13.2%.  From a practical point of view, this means that if you were to review all 900 tools, the true proportion of problem tools is likely to fall somewhere between 1.5% and 13.2% (with 95% certainty).  If your sample of 74 was truly random, then 5.4% is a good guess of what you would find in the population.
>
> From the perspective of confidence intervals, whether or not your sample of 74 is large enough depends on how much precision you want for your estimate.  If the interval from 1.5% to 13.2% is too wide, a larger sample would produce a narrower interval, and this can be computed.
>
>Answering the question, "Does the sample proportion differ from the population proportion", requires you to specify a value to test it against.  Perhaps, though, your confidence interval can provide you the information you need.  From the output you have, you can say, with 95% confidence, that it is quite unlikely that the population proportion is larger than, say, 14% (assuming that you have a random sample), because 14% lies outside the 95% confidence interval.
>
>(There is something called the "finite population correction" factor, which I think you can ignore in this case.  The FPC provides an adjustment which reduces the estimated variance when the sample is "large" relative to the population.  As I recall, it really only makes a difference if the sample is more than 10% of the population. If I'm wrong on this, someone please correct me.)
>
>
>-p
>
>
>
>_____________________
>Paul F. Visintainer, PhD
>Baystate Health System
>280 Chestnut Street
>Springfield, MA 01199
>paul.visintainer@bhs.org
>
>
>-----Original Message-----
>From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of nicola.baldini2@unibo.it
>Sent: Wednesday, March 25, 2009 2:05 PM
>To: statalist@hsphsun2.harvard.edu
>Subject: st: binary proportions and sample size
>
>This is a stupid question stemming from a practical problem. The problem sounds easy and familiar, but I don't have a statistics manual at hand and have no idea which keywords to use to refine a search for the solution on Statalist.
>I took a (let's say random) sample of 74 tools from a population of 900. I personally checked each of the 74 tools, and found that 4 of them have some problems (5.4%) and 70 are ok. Can I expect to find 49 (5.4% x 900) problems in the population (i.e. is the sample size big enough to say so) and can I attach a p-value to my expectations (i.e. what is the probability that the problem rates in the sample and in the population are different - statistically significantly different)? (And, obviously, which Stata command will answer my questions?)