

From: Joerg Luedicke <joerg.luedicke@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: create unique random number variable
Date: Tue, 24 Apr 2012 11:04:26 -0700

On further thought, the problem does not seem to be the mixture per se. Of course, we can have a mixture with as many components as we have data points, and that would still be essentially the same as just drawing once. For example, we could type:

*---------
clear
set obs 100
gen p1=runiform() in 1
forval i=2/100 {
    replace p1=runiform() in `i'
}
*---------

and this would still be equivalent to:

*---------
gen p2=runiform()
*---------

However, drawing again in case of ties may depend on the density of the uniformly distributed data points, because ties are more likely to appear in regions (however defined) with higher density, and the suggested algorithm then searches for values which are more likely to appear in regions with lower density. This seems to be the reason for the distribution being smoother compared to the original draw.

J.

On Tue, Apr 24, 2012 at 10:19 AM, Joerg Luedicke <joerg.luedicke@gmail.com> wrote:
> Stas,
>
> Just out of curiosity: could following this approach still be
> described as a strictly random draw (of course, 'strictly' in terms of
> pseudo-randomness) from a uniform distribution? Because what
> essentially happens is that the randomly emerging ties are filled in
> with yet another draw from the uniform. As a consequence, the
> resulting integers are drawn from a mixture of several or many uniform
> distributions. The component probabilities themselves then depend on
> randomly emerging ties, so it should not make much difference in
> practice. However, the resulting distribution looks somewhat smoother
> than one might expect (due to being a mixture of k uniforms, I
> presume). Compare the following histograms before and after the
> redraws (for which I modified your code):
>
>
> //draw from uniform (0,1)
> clear
> set obs 1000000
> set seed 1234
> generate uu =runiform()
> hist uu, name(unif, replace) bin(1000)
>
> //mapped to integers
> clear
> set obs 1000000
> set seed 1234
> generate uu = int(1500000*uniform())
> bysort uu: generate byte nonuniq = _n > 1
> hist uu, name(g0, replace) bin(1000)
>
> //drawing again in case of ties
> sum nonuniq
> while r(max) > 0 {
>     bysort uu: replace uu = int(1500000*uniform()) if _n > 1
>     bysort uu: replace nonuniq = _n > 1
>     sum nonuniq, mean
> }
> hist uu, name(g1, replace) bin(1000)
>
> So I don't know what OP's demands are with regard to 'randomness', but
> maybe this could matter in some applications? (Perhaps in rocket
> science :) )
>
> J.
>
>
> On Tue, Apr 24, 2012 at 7:43 AM, Stas Kolenikov <skolenik@gmail.com> wrote:
>> On Tue, Apr 24, 2012 at 4:37 AM, raoul reulen <r.c.reulen@gmail.com> wrote:
>>> Hello
>>>
>>> I'm trying to generate a random number variable like this:
>>>
>>> .set seed 12345
>>> .gen x = int(1000*uniform())
>>>
>>> However, the random numbers in variable x are not unique. Is there a
>>> way to ensure they are unique?
>>
>> clear
>> set obs 400
>> * this is your sample size
>>
>> generate uu = int(1000*uniform())
>> bysort uu: generate byte nonuniq = _n > 1
>> sum nonuniq, mean
>> while r(max) > 0 {
>>     bysort uu: replace uu = int(1000*uniform()) if _n > 1
>>     bysort uu: replace nonuniq = _n > 1
>>     sum nonuniq, mean
>> }
>> drop nonuniq
>>
>> --
>> Stas Kolenikov, also found at http://stas.kolenikov.name
>> Small print: I use this email account for mailing lists only.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
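For comparison with the redraw-on-ties loop quoted in this message, here is a minimal sketch of a different way to obtain unique random integers. It is not taken from the thread: it shuffles the full pool of candidate values and keeps the first n, which amounts to sampling without replacement. The variable names `x` and `u`, the pool of 0 to 999, and the sample size of 400 are illustrative assumptions chosen to mirror the numbers used in the thread.

```stata
* Sketch only: unique random integers via a random permutation.
* Assumes the candidate pool (0..999) can be created as observations.
clear
set seed 12345
set obs 1000                      // one observation per candidate value
generate long x = _n - 1          // candidate integers 0, 1, ..., 999
generate double u = runiform()    // random sort key for each candidate
sort u                            // shuffle the candidates
keep in 1/400                     // first 400 rows: sample without replacement
drop u
isid x                            // exits with an error if x is not unique
```

Since each candidate value appears exactly once in the pool, ties cannot arise and no redraw loop is needed. The trade-off is that the whole range has to exist as observations before sampling, which may be less convenient than the loop when the range is much larger than the sample.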

**References**:
- **st: create unique random number variable** *From:* raoul reulen <r.c.reulen@gmail.com>
- **Re: st: create unique random number variable** *From:* Stas Kolenikov <skolenik@gmail.com>
- **Re: st: create unique random number variable** *From:* Joerg Luedicke <joerg.luedicke@gmail.com>
