
From: "Seed, Paul" <paul.seed@kcl.ac.uk>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Detection of disease
Date: Fri, 15 Aug 2008 14:57:55 +0100

Carlo Georges poses an interesting problem.

To deal with a few epidemiological issues first: "...to be 95% certain that the population is free from disease," he must assume

- that the disease level is either 20% [null hypothesis, H0] or 0% [alternative hypothesis, Ha] (a lower rate might well be missed);
- that the sample is representative of the population (a local outbreak outside the sampling area would certainly be missed);
- that the test used is 100% sensitive.

A more realistic goal might be "...to be 95% certain that the test-positive rate is less than 20% in the population represented by the sample."

He is interested in a one-sided test at the 95% level, as probabilities < 0 have no meaning; so the standard Stata command is

    sampsi .2 0 , onesample onesided

This gives n=11 (not 16), which is still different from the n=14 from the freeware package "Winepiscope" that Carlo uses. The reason is that -sampsi- uses Normal approximations for percentages, which tend to give smaller values than exact tests.

To replicate Carlo's result, another approach is needed. This is made much easier by the fact that the disease level is 0% under Ha, so no events are expected. We can perform both tests in Stata, using -bitesti- for the exact test and -prtesti- for the Normal approximation (or Chi-sq test), each against the hypothesized proportion of 0.2.

    foreach n of numlist 10/15 {
        bitesti `n' 0 .2
        prtesti `n' 0 .2
    }

Concentrating on the one-sided p-values (Ha: p < 0.2), it is clear that 14 subjects is the smallest number to give a significant result by the exact test, and 11 by the Normal approximation. The first figure confirms the Winepiscope result.

An added level of sophistication is to look at the confidence intervals. Stata offers several: Wald (a version of the Normal approximation), "exact" (Clopper-Pearson), Wilson, Agresti-Coull and Jeffreys. 90% CIs are needed to give a one-sided 95% interval. Both the Wald and Jeffreys intervals perform poorly in this case, but Wilson, "exact" and Agresti-Coull are worth considering.
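With zero observed positives, the p-values and two of the upper confidence limits have simple closed forms, so the figures can be checked by hand. What follows is only a sketch under stated assumptions, not the method used by -bitesti-, -prtesti- or -cii- internally: it assumes H0: p = 0.2 and a one-sided alpha of 0.05, and the local macro names are made up for illustration. The exact p-value is 0.8^n, the Normal-approximation p-value comes from z = (0 - 0.2)/sqrt(0.2*0.8/n), the Clopper-Pearson upper limit is 1 - 0.05^(1/n), and the Wilson upper limit reduces to z^2/(n + z^2) when there are no events.

```stata
* Hand check for the zero-event case (run as a do-file).
* Assumptions: H0 p = 0.2, one-sided alpha = 0.05; macro names are illustrative.
local z = invnormal(0.95)                              // one-sided 95% Normal quantile
display "  n   p(exact)  p(normal)  ub(exact) ub(Wilson)"
forvalues n = 10/15 {
    local p_exact   = 0.8^`n'                          // P(X = 0 | n, p = 0.2)
    local p_normal  = normal(-0.2/sqrt(0.2*0.8/`n'))   // lower-tail Normal approximation
    local ub_exact  = 1 - 0.05^(1/`n')                 // Clopper-Pearson upper limit, 0 events
    local ub_wilson = `z'^2/(`n' + `z'^2)              // Wilson upper limit, 0 events
    display %3.0f `n' %11.4f `p_exact' %11.4f `p_normal' %11.4f `ub_exact' %11.4f `ub_wilson'
}
```

Reading across each row, the exact p-value falls below 0.05 at the same n at which the Clopper-Pearson upper limit falls below 0.2, and the Normal-approximation p-value does so at the same n as the Wilson limit.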
In particular, the Wilson interval seems to fit with the results of -prtesti-, which may be of interest, as there are arguments that the "exact" test is in fact over-conservative (hence the quotation marks). (I could dig out the references if anyone's interested.)

    cii 14 0 , exact level(90)

                                                         -- Binomial Exact --
        Variable |        Obs        Mean    Std. Err.       [90% Conf. Interval]
    -------------+---------------------------------------------------------------
                 |         14           0           0               0    .1926362*

    (*) one-sided, 95% confidence interval

    cii 14 0 , wald level(90)

                                                         -- Binomial Wald ---
        Variable |        Obs        Mean    Std. Err.       [90% Conf. Interval]
    -------------+---------------------------------------------------------------
                 |         14           0           0               0           0

    cii 14 0 , wilson level(90)

                                                         ------ Wilson ------
        Variable |        Obs        Mean    Std. Err.       [90% Conf. Interval]
    -------------+---------------------------------------------------------------
                 |         14           0           0               0    .1619548

    cii 14 0 , agresti level(90)

                                                         -- Agresti-Coull ---
        Variable |        Obs        Mean    Std. Err.       [90% Conf. Interval]
    -------------+---------------------------------------------------------------
                 |         14           0           0               0    .1907622

    The Agresti-Coull interval was clipped at the lower endpoint.

    cii 14 0 , jeffreys level(90)

                                                         ----- Jeffreys -----
        Variable |        Obs        Mean    Std. Err.       [90% Conf. Interval]
    -------------+---------------------------------------------------------------
                 |         14           0           0               0    .1260576

Date: Thu, 14 Aug 2008 11:53:33 +0200
From: "Carlo Georges" <georgesc@pt.lu>
Subject: st: Detection of disease

I tried to reproduce in Stata the calculation needed for the following case: I need to determine the sample size required to detect the presence of disease in a population. The formula is rather complex, so it is difficult to paste in here. For example, I need to detect with 95% confidence the absence of disease in a population where the presumed prevalence would be 20%. How large a sample size do I need to be 95% certain that the population is free from disease?
I used a program, "Winepiscope" (freeware), that calculated a sample size of 14. In Stata I tried:

    sampsi 0.2 0, power(0.9) onesample

and I get a result of 16. Can Stata handle this type of calculation?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

