



st: inconsistent random numbers even using -set seed-


From   "Seed, Paul" <[email protected]>
To   "[email protected]" <[email protected]>
Subject   st: inconsistent random numbers even using -set seed-
Date   Wed, 22 Jan 2014 11:02:19 +0000

Dear Statalist, 

I spent several hours yesterday trying to deal with a program that gave 
inconsistent results when re-run.

Eventually I tracked it down.
As I have never seen this discussed before, I thought it was worth sharing.

A change to the manual might even be called for.


Here is how it looks:

***************************
* Example code showing problem *

version 11.2
set more off
sysuse auto, clear

bys rep78: su  price mpg

set seed 1234
gen rand = runiform()

bys rep78 (rand) : keep if _n <= _N/2
bys rep78: su  price mpg

* End example *
***********************

If you run this code repeatedly, you will find that the second set of 
summaries changes from run to run.

After much trouble I found that the inconsistency depends on the sort order.
The runiform() function is producing exactly the same sequence of pseudorandom 
numbers each time, and putting them into records 1, 2, 3, 4... as they are produced.

However, sorting by rep78 is only a _partial_ sort.  
The order within each value of rep78 is determined arbitrarily by some 
internal Stata process, and changes each time. So records 1, 2, 3, 4... are 
not the same cars each time, and each car receives a different random number.
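
One way to see the effect directly is to sort by rep78 twice and compare 
the record order each time. This is only a sketch: the variable names 
order1 and order2 are made up for illustration, and the count it reports 
may differ from run to run or between Stata versions.

***************************
* Example code checking the order within rep78 *

version 11.2
sysuse auto, clear

* First sort by rep78: record the position of each car
sort rep78
gen long order1 = _n

* Change the sort order, so that the next sort is actually carried out
sort make

* Second sort by rep78: record the positions again
sort rep78
gen long order2 = _n

* A non-zero count means some cars changed position between the two sorts
count if order1 != order2

* End example *
***********************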

To get consistent results, a complete sort is needed before the random numbers are generated.
In this case we can use the fact that each make of car appears once only.
I can use either 
	bys rep78 (make): su  price mpg

or 
	bys rep78: su  price mpg
	sort make

The results will differ depending on which I choose, but 
they will not vary from run to run.

***************************
* Example code showing solution 1*

version 11.2
set more off
sysuse auto, clear

* Crucial change here
bys rep78 (make): su  price mpg

set seed 1234
gen rand = runiform()

bys rep78 (rand) : keep if _n <= _N/2
bys rep78: su  price mpg

* End example *
***********************
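
The second option (sorting by make alone) could be sketched along the same 
lines. The -isid- check is an addition for illustration, not part of the 
code above: it assumes make uniquely identifies each record, and stops 
with an error if it does not.

***************************
* Example code showing solution 2 (sketch) *

version 11.2
set more off
sysuse auto, clear

bys rep78: su  price mpg

* Crucial change here: confirm make is unique, then do a complete sort
isid make
sort make

set seed 1234
gen rand = runiform()

bys rep78 (rand) : keep if _n <= _N/2
bys rep78: su  price mpg

* End example *
***********************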



