Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: higher occurrence of disease X in rare disease Y

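|         |                                                          |
|---------|----------------------------------------------------------|
| From    | Nick Cox                                                 |
| To      | "statalist@hsphsun2.harvard.edu"                         |
| Subject | Re: st: higher occurrence of disease X in rare disease Y |
| Date    | Thu, 5 Dec 2013 14:50:44 +0000                           |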

```
Doug's code is fine and suits the purpose perfectly.

A footnote is to underline that Mata can be used to the same "smart
calculator" end, e.g.

. mata
: X = (0..6)'
: binomialp(6,X,1/36)
                 1
    +---------------+
  1 |  .8444875698  |
  2 |  .1447692977  |
  3 |  .0103406641  |
  4 |  .0003939301  |
  5 |  8.44136e-06  |
  6 |  9.64727e-08  |
  7 |  4.59394e-10  |
    +---------------+
: binomialp(6,(0..6)',1/36)
                 1
    +---------------+
  1 |  .8444875698  |
  2 |  .1447692977  |
  3 |  .0103406641  |
  4 |  .0003939301  |
  5 |  8.44136e-06  |
  6 |  9.64727e-08  |
  7 |  4.59394e-10  |
    +---------------+

Nick
njcoxstata@gmail.com

On 5 December 2013 14:42, Doug Hemken <dehemken@wisc.edu> wrote:
> Here is a script that illustrates the probabilities using Stata. It is such a small problem that you can illustrate statistical power by trial-and-error:
>
> set obs 7
> gen X = _n - 1
> gen prob = binomialp(6,X,1/36)
> gen prob10 = binomialp(10,X,1/36)
> gen prob12 = binomialp(12,2*X,1/36)
>
>
> On 12/05/13, Doug Hemken  wrote:
>> If your sample size is literally six cases, then the probability of seeing exactly 1 case of disease X is 0.145 (binomial distribution with n = 6 and p = 1/36). If there is no relation between Y and X, it wouldn't be too unusual to see 1 case of X crop up in 6 cases of Y.
>>
>> On 12/05/13, "tiong21@netzero.net" wrote:
>> > The prevalence of disease (X) is 1 in 36 in the general population. In a sample population with a very rare disease (Y) of unknown etiology, the prevalence of disease X is 1 in 6 (i.e., 1 case of X was found in the sample population of 6 rare cases of disease Y). How do I show statistically that this higher occurrence of disease X in rare disease Y is not due to chance? And, as a corollary, suggest that disease X may be a contributory factor in the etiology of disease Y (an issue of causality)? Furthermore, should a Poisson distribution be used to calculate the probabilities? A sample Stata script will be much appreciated.
>> >
>> >
>> > Tiong The
>> > tiong21@netzero.net
>> >
>> > *
>> > * For searches and help try:
>> > * http://www.stata.com/help.cgi?search
>> > * http://www.stata.com/support/faqs/resources/statalist-faq/
>> > * http://www.ats.ucla.edu/stat/stata/
>>
>> --
>> Doug Hemken
>> 4226I Social Science Bldg.
>>
>> dehemken@wisc.edu
>> 262-4327
>
> --
> Doug Hemken
> 4226I Social Science Bldg.
>
> dehemken@wisc.edu
> 262-4327
>
> To make a consulting appointment send me an email, or use the on-line scheduler:
> https://calendar.wisc.edu/scheduling-assistant/public/profiles/PlRxCykH.html

```
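
The thread asks whether 1 case of X among 6 cases of Y could be due to chance, and whether a Poisson distribution should be used. As a rough sketch only: for a one-sided comparison the quantity of interest is usually the upper tail, P(at least 1 case), rather than the probability of exactly 1 case. The lines below assume the 6 cases are independent and that the background prevalence is exactly 1/36; the one-sided framing and the 0.05 threshold mentioned afterwards are assumptions made for illustration, not something stated in the posts. `binomialtail()` and `poissontail()` are built-in Stata functions.

```stata
* Sketch under stated assumptions: 6 independent cases of Y,
* background prevalence of X equal to 1/36.

* P(at least 1 case of X in 6): upper tail of Binomial(6, 1/36)
display binomialtail(6, 1, 1/36)

* Same quantity, computed as the complement of P(0 cases)
display 1 - binomialp(6, 0, 1/36)

* Poisson approximation with mean 6/36, for comparison
display poissontail(6/36, 1)

* Upper-tail probability for each possible count k = 1, ..., 6
forvalues k = 1/6 {
    display "k = `k'   P(X >= k) = " %9.6f binomialtail(6, `k', 1/36)
}
```

The first two lines should both give about 0.156 and the Poisson line about 0.154, so the two distributions barely differ here. Under these assumptions a single case of X among 6 is not strong evidence against chance; it would take 2 or more cases (tail probability about 0.011) to fall below 0.05.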