
# Re: st: create unique random number variable

From: Joerg Luedicke
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: create unique random number variable
Date: Tue, 24 Apr 2012 10:19:38 -0700

```
Stas,

Just out of curiosity: could following this approach still be
described as a strictly random draw (of course, 'strictly' in terms of
pseudo-randomness) from a uniform distribution? What essentially
happens is that the randomly emerging ties are filled in
with yet another draw from the uniform. As a consequence, the
resulting integers are drawn from a mixture of several or many uniform
distributions. The component probabilities themselves then depend on
randomly emerging ties, so it should not make much difference in
practice. However, the resulting distribution looks somewhat smoother
than one might expect (due to being a mixture of k uniforms, I
presume). Compare the following histograms before and after the
redraws (for which I modified your code):

//draw from uniform (0,1)
clear
set obs 1000000
set seed 1234
generate uu = runiform()
hist uu, name(unif, replace) bin(1000)

//mapped to integers
clear
set obs 1000000
set seed 1234
generate uu = int(1500000*uniform())
bysort uu: generate byte nonuniq = _n > 1
hist uu, name(g0, replace) bin(1000)

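//drawing again in case of ties
sum nonuniq, mean
while r(max) > 0 {
    // keep the first observation of each tied value, redraw the rest
    bysort uu: replace uu = int(1500000*uniform()) if _n > 1
    // re-flag any ties created by the redraw
    bysort uu: replace nonuniq = _n > 1
    sum nonuniq, mean
}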
hist uu, name(g1, replace) bin(1000)

So I don't know what OP's demands are with regard to 'randomness', but
maybe this could matter in some applications? (Perhaps in rocket
science :)  )

J.

On Tue, Apr 24, 2012 at 7:43 AM, Stas Kolenikov <skolenik@gmail.com> wrote:
> On Tue, Apr 24, 2012 at 4:37 AM, raoul reulen <r.c.reulen@gmail.com> wrote:
>> Hello
>>
>> I'm trying to generate a random number variable like this:
>>
>> .set seed 12345
>> .gen x = int(1000*uniform())
>>
>> However, the random numbers in variable x are not unique. Is there a
>> way to ensure they are unique?
>
> clear
> set obs 400
> * this is your sample size
>
> generate uu = int(1000*uniform())
> bysort uu: generate byte nonuniq = _n > 1
> sum nonuniq, mean
> while r(max) > 0 {
> bysort uu: replace uu = int(1000*uniform()) if _n > 1
> bysort uu: replace nonuniq = _n > 1
> sum nonuniq, mean
> }
> drop nonuniq
>
> --
> Stas Kolenikov, also found at http://stas.kolenikov.name
> Small print: I use this email account for mailing lists only.
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

```
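For comparison with the redraw loop in the post above, here is a minimal sketch of a different technique that is not from the thread: shuffle the whole integer range and keep the first n values, which gives unique integers without any redraws. The range and sample size are copied from the post. The variable names `key1` and `key2` and the graph name `g2` are made up for illustration. Treat it as a sketch under those assumptions, not as the method either poster had in mind.

```stata
// Sketch only: draw n distinct integers from 0..N-1 by shuffling the full range
clear
set seed 1234
local N = 1500000                  // size of the integer range, as in the post
local n = 1000000                  // number of unique values wanted, as in the post
set obs `N'
generate long uu = _n - 1          // every candidate value 0, 1, ..., N-1 exactly once
generate double key1 = runiform()  // random sort key (hypothetical name)
generate double key2 = runiform()  // second key, breaks any ties in key1
sort key1 key2                     // random permutation of the candidates
keep in 1/`n'                      // first n rows: a sample without replacement
drop key1 key2
isid uu                            // stops with an error if any value repeats
hist uu, name(g2, replace) bin(1000)
```

Because each candidate value appears exactly once before the shuffle, uniqueness holds by construction, and `isid` only serves as a check. The cost is holding all N candidates in memory, so this is practical only when the range is not much larger than the sample. The histogram `g2` can be set beside `g1` from the post; sampling without replacement from a finite range generally leaves less bin-to-bin variation than independent draws, which may be related to the smoothness noted above.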