[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

From |
n j cox <[email protected]> |

To |
[email protected] |

Subject |
Re: st: Expert Opinion - An alternative to rndbin |

Date |
Mon, 24 Sep 2007 19:23:53 +0100 |

-rndbin-

========

I posted at length on -rndbin- a few days ago in a post which

Tiago acknowledged. However, it seems that I didn't make myself

very clear. The only reason I can think of for using -rndbin- is that you are not aware of an alternative. (That could be decisive; many users would understandably grudge the time taken to work out an alternative, especially if they are not fluent in Stata. Life is short, and so forth.)

But I gave direct alternatives in my posting. By comparison,

-rndbin- is slow, it is incompletely documented, and it can bite you

because of the way that it messes with -set-. It also requires you

to -rename- its result whenever you only want more than one

binomial random variable in memory.

Bernoulli

=========

In Tiago's example here the binomial collapses to the Bernoulli

and simulation similarly collapses to

. gen xb = uniform() < 0.25

To repeat the moral of last week's posting, no program is

necessary when you can do it directly. (And in this case, I see no

gain in using Mata.) The number of events (whatever is coded 1) is

then quickly accessible as r(sum) after

. su xb, meanonly

or using -count-, as in Tiago's post.

Coding directly is always going to be faster than -rndbin-.

My experiments indicate that it is about 5 times faster than

-rndbin-. That could matter to you but it is far less of a

problem than Tiago implies. If Tiago is experiencing a difference

between seconds and hours, then something else is biting, perhaps

a memory problem, but not the speed of -rndbin-. -rndbin-

really isn't that slow, just slower than it need be.

Normal approximation

====================

Tiago quotes the normal approximation to the binomial. I am not

clear how this is relevant. If you know, or assume, something is

binomial, then its mean, SD and other properties follow directly

and there is no need for simulation. If you want the random

numbers (0s and 1s) directly, then as the Bernoulli is (very!)

discrete, random normal deviates would seem inappropriate.

Normal randoms are in any case no faster to get than Bernoulli

randoms, so saving time is not an issue.

Nick

[email protected]

Tiago Pereira

I would like to know your expert opinion about a faster approach to

generate binomial random numbers using Stata. I wish to know if the

approach below is valid and if it makes some sense.

Well, for example, assume we have _N=1. Then, we have 25000 observations

and p= 0.25, which is the probability of the event. -rndbin- is quite

straighforward:

rndbin 25000 0.25 1

qui count if xb==1

However, I have to run -rndbin- millions of times, say, 10^15 times, count

the number of events (xb=1) and then summarize it to get a new variable.

That approach takes a lot of time and even using Mata functions this is

time-consuming.

Taking statistical aspects of the binomial distribution into account, may

I approximate that calculation using the following approach?

p = 0.25

observations = 25000

sd_of_p_hat = standard deviation of the p_hat

gene sd_of_p_hat= sqrt(((p)*(1-p))/(observations))

generate z = invnorm(uniform())

replace p = (z)*(sd_of_p_hat)+(p)

gene number_of_events= round(p*observations)

The latter approach is really faster (2-3 seconds for 100000 studies,

whereas -rndbin- are likely to take some hours, at least in my PC) and it

is likely to be unbiased for p�s between 0.3 and 0.7, the range I have to

work with in Human Genetics.

*

* For searches and help try:

* http://www.stata.com/support/faqs/res/findit.html

* http://www.stata.com/support/statalist/faq

* http://www.ats.ucla.edu/stat/stata/

- Prev by Date:
**Re: Random number routines [was: st: Simple Question - Use of rndbin]** - Next by Date:
**Re: st: loops** - Previous by thread:
**st: dtobit comand in STATA 9.2** - Next by thread:
**st: Expert Opinion - An alternative to rndbin** - Index(es):

© Copyright 1996–2024 StataCorp LLC | Terms of use | Privacy | Contact us | What's new | Site index |