Statalist



Re: st: RE: Hypergeometric Distribution


From   "Stas Kolenikov" <[email protected]>
To   [email protected]
Subject   Re: st: RE: Hypergeometric Distribution
Date   Thu, 23 Aug 2007 19:11:55 -0500

I know that the following was a breakthrough paper speeding up
hypergeometric computations quite a bit:
http://math.mit.edu/~plamen/files/hyper.pdf. Marcello, this guy is in
your geographic area, so you can get him out for a coffee or something
to see if there are any fast algorithms to work out in [St|M]ata.

On 8/23/07, Mike Lacy <[email protected]> wrote:
>  >Date: Wed, 22 Aug 2007 15:32:38 +0100
>  >From: "Newson, Roger B" <[email protected]>
>  >Subject: st: RE: Hypergeometric Distribution
>  >
>  >Thanks to Marcello for telling us all about this recently-published
>  >algorithm, which looks very useful. A search on
>  >
>  >findit hypergeometric
>  >
>  >in Stata finds a single reference (to a SSC package), which was
>  >distributed as long ago as 1999. This suggests that the new algorithm
>  >might be a good candidate for implementation in Mata by Marcello, or by
>  >anybody else with the time and inclination to do so.
>
> I have something similar and could use a collaborator:
>
> I have a Stata program to calculate hypergeometric probabilities
> using the algorithm of:
>
> Berry, K. J., & Mielke, P. W. (1983). A rapid FORTRAN subroutine
> for the Fisher exact probability test. Educational and
> Psychological Measurement, 43, 167-171.
>
> Their algorithm exploits a recursion, and so avoids the calculation
> of any factorials or log factorials in calculating the
> hypergeometric.  I suspect it is faster than even the newly published
> algorithm, although I don't know. It is particularly suited to
> applications in which the entire vector of probabilities across the
> range of the variable is needed (e.g., Fisher's Exact), since it has
> to calculate all the probabilities to get just one of them.  However,
> it can do all the probabilities for a variable with what I believe is
> lower O() complexity than a conventional algorithm would calculate
> any single one.
>
> I am not a "production quality" Stata programmer, and don't want to
> take the time to be one, so if anyone else is interested, I'd be
> happy to send them my code to be dressed up for public use. I
> considered posting the program to the list (only about 40 lines), but
> didn't know if that was quite appropriate.
>
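
For readers who want to see the kind of recursion Mike describes, here is a
minimal sketch in Python rather than Stata or Mata. It is not Mike's program
and not necessarily the exact Berry & Mielke routine; it only illustrates the
general idea of getting the whole hypergeometric probability vector from
ratios of adjacent terms, with no factorials or log factorials. The function
name and the argument names (N = population size, K = successes in the
population, n = draws) are my own assumptions. Starting at the mode, so that
every unnormalized weight is at most 1, is also my choice, made to avoid
overflow.

```python
def hypergeom_pmf_vector(N, K, n):
    """Return (lo, probs), where probs[i] = P(X = lo + i).

    Hypothetical helper: N = population size, K = successes in the
    population, n = number of draws without replacement.
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("need 0 <= K <= N and 0 <= n <= N")
    lo = max(0, n - (N - K))   # smallest attainable count
    hi = min(n, K)             # largest attainable count
    # Mode of the distribution, clamped into the support [lo, hi].
    mode = min(max((n + 1) * (K + 1) // (N + 2), lo), hi)
    w = [0.0] * (hi - lo + 1)  # unnormalized weights
    w[mode - lo] = 1.0
    # Upward: P(k+1)/P(k) = (K-k)(n-k) / ((k+1)(N-K-n+k+1))
    for k in range(mode, hi):
        w[k + 1 - lo] = (w[k - lo] * ((K - k) * (n - k))
                         / ((k + 1) * (N - K - n + k + 1)))
    # Downward: P(k-1)/P(k) = k(N-K-n+k) / ((K-k+1)(n-k+1))
    for k in range(mode, lo, -1):
        w[k - 1 - lo] = (w[k - lo] * (k * (N - K - n + k))
                         / ((K - k + 1) * (n - k + 1)))
    total = sum(w)             # normalizing constant
    return lo, [x / total for x in w]


if __name__ == "__main__":
    lo, probs = hypergeom_pmf_vector(10, 4, 3)
    for i, p in enumerate(probs):
        print(lo + i, p)
```

Each probability costs a few multiplications and divisions, so the whole
vector takes work proportional to the length of the support. For a Fisher's
exact test one would then sum the entries of probs that do not exceed the
probability of the observed table.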


-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: Please do not reply to my Gmail address as I don't check
it regularly.