
Re: st: RE: Hypergeometric Distribution

From	Marcello Pagano <[email protected]>
To	[email protected]
Subject	Re: st: RE: Hypergeometric Distribution
Date	Wed, 22 Aug 2007 20:12:23 -0400

The hypergeometric plays a central role when sampling from a finite population. The binomial provides an approximation for large samples, but why rely on approximations today when they are not necessary? And how good is the approximation, anyway? Possibly our reliance on the binomial approximation has lulled us into a complacency that contributed to the "evidence since 1999"?
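
One rough way to look at how good the approximation is in a particular case is to evaluate both probabilities side by side. The sketch below is only an illustration: the values of N, K, n and k are arbitrary assumptions, the hypergeometric line uses the -comb()- expression quoted further down in this thread, and the binomial line takes p = K/N.

```stata
* Illustrative values only (assumptions, not from the thread)
scalar N = 100    // population size
scalar K = 30     // number of "successes" in the population
scalar n = 20     // sample size
scalar k = 6      // number of "successes" in the sample

* Exact hypergeometric probability via -comb()-
scalar hyper = comb(K, k) * comb(N - K, n - k) / comb(N, n)

* Binomial approximation with p = K/N
scalar p = K / N
scalar binom = comb(n, k) * p^k * (1 - p)^(n - k)

display "hypergeometric: " hyper "   binomial: " binom
```

Repeating this over a range of k, and for sample sizes that are a larger or smaller fraction of N, would give a feel for where the two diverge.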

I did a little research with -comb()-, and that works pretty well, but it was a very limited study. A Stata function with all its usual associated robustness and accuracy would be nice, in my opinion.

m.p.

Nick Cox wrote:

There are many answers to this, but dinner supervenes.
If you push hard enough, and show why it is needed, either StataCorp or a user will write a program for this.
The evidence since 1999 has been that such a program is not needed.
Nick [email protected]

Marcello Pagano wrote:

Why buy Stata if you are expected to do all this for yourself?

Nick Cox wrote:

Apply -ln()-, -exp()- and -cond()- as needed.
Nick [email protected]

Marcello Pagano wrote:

Just concerned with the accuracy.

Nick Cox wrote:

Roger's posting includes what I presume is an allusion to an -egen- function _ghyper.ado that I wrote in 1999.
I withdrew this program as redundant some years ago, given that you can use something like
comb(K, k) * comb(N - K, n - k) / comb(N, n)

wherever you want. In context N, K, n, k may be variables, scalars or placeholders for numeric constants, or any mixture thereof.
This might need a wrapper to yield zeros where appropriate, or it might need care whenever individual terms get very large, but otherwise does it raise any problems?
Nick [email protected]
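
A minimal sketch of the kind of wrapper described above, written for a do-file and assuming N, K, n and k are held in scalars (the values shown are illustrative only). Working on the log scale with -lnfactorial()- instead of calling -comb()- directly is one possible way to keep individual terms from overflowing; that choice is an assumption of this sketch, not something proposed in the thread. -cond()- returns zero whenever k lies outside the support max(0, n - (N - K)) <= k <= min(n, K).

```stata
* Illustrative values only (assumptions); with numbers this large,
* comb(N, n) on its own is too big to be stored as a double
scalar N = 5000
scalar K = 2000
scalar n = 1000
scalar k = 400

* ln of comb(K, k) * comb(N - K, n - k) / comb(N, n), term by term
scalar lnp = lnfactorial(K) - lnfactorial(k) - lnfactorial(K - k)            ///
    + lnfactorial(N - K) - lnfactorial(n - k) - lnfactorial(N - K - n + k)   ///
    - lnfactorial(N) + lnfactorial(n) + lnfactorial(N - n)

* Zero outside the support, exp(lnp) inside it
scalar prob = cond(k < max(0, n - (N - K)) | k > min(n, K), 0, exp(lnp))

display prob
```

The same expression should carry over to -generate- with variables in place of the scalars. Its accuracy in the far tails is not checked here.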

Marcello Pagano wrote:

I looked at -ssizebi- but it seems to be focused on power and sample sizes.

Newson, Roger B wrote:

Thanks to Marcello for telling us all about this recently-published algorithm, which looks very useful. A search on

findit hypergeometric

in Stata finds a single reference (to an SSC package), which was distributed as long ago as 1999. This suggests that the new algorithm might be a good candidate for implementation in Mata by Marcello, or by anybody else with the time and inclination to do so.

Marcello Pagano wrote:

Does anyone have or know of Stata code to calculate the Hypergeometric Distribution accurately?

See Journal of Discrete Algorithms, Volume 5, Issue 2 (June 2007), pages 341-347, for an article by Berkopec, "HyperQuick algorithm for discrete hypergeometric distribution"
<http://portal.acm.org/citation.cfm?id=1240586&coll=GUIDE&dl=GUIDE&CFID=27443384&CFTOKEN=80678482>.

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
