

From: Stas Kolenikov <skolenik@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Default Seed of Stata 12
Date: Fri, 26 Oct 2012 13:44:43 -0500

Pretty much every number is as good as any other number for a starting value. Stata quietly cycles 100 states to get away from your starting value, anyway. The various pieces of advice I've seen regarding the starting values (including probably some statalist discussions) include:

1. use today's date: set seed 20121026
2. pull a bill out of your pocket, and copy its numbers
3. take a look at your RSA key and use the digits from there (sh-h-h... I hope my IT department is not listening to this)
4. use an actual random number from random.org
5. use a Dilbert-like random number generator (http://dilbert.com/strips/comic/2001-10-25/)

The way I typically set up my simulations is to have a workhorse file that takes something like

    args n seed eye_color hair_color
    log using "simulation-`c(current_date)'-`n'-`seed'-`eye_color'-`hair_color'"

where `n' is the number of observations to create, `seed' is obviously the random seed, and the rest are the parameters of the data generation process. I would try it with obviously human-produced parameters like

    do workhorse 111 10101 orange purple

(provided, of course, that my file knows what to do with these parameters), and then for my actual simulation on a cluster, I would produce a wrapper

===
args seed
foreach n of numlist 100 200 500 {
    foreach eye in green blue brown {
        foreach hair in blond black brown {
            do workhorse `n' `seed' `eye' `hair'
        }
    }
}
===

and then create a few dozen single-line do-files with just "do wrapper <seed>"; I would produce them automatically with the -file- command, and even launch them with the OS execution utility. So if I bothered too much about the seeds, I would never be able to set this up computationally efficiently :).
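The step of producing the single-line do-files with -file- might look roughly like the sketch below. The three seed values and the run_<seed>.do naming pattern are made up for illustration, and wrapper.do is assumed to be the wrapper described in this message sitting in the working directory.

```stata
// Sketch only: write one single-line launcher do-file per seed.
// The seed list and the run_<seed>.do file names are illustrative assumptions.
local seeds 20121026 10101 987654321

foreach s of local seeds {
    tempname fh
    // creates (or overwrites) run_<seed>.do as a plain text file
    file open `fh' using "run_`s'.do", write text replace
    // the file's only line calls the wrapper with this seed
    file write `fh' "do wrapper `s'" _n
    file close `fh'
}
```

Each generated file can then be handed to whatever job launcher the cluster provides; how that is invoked depends on the local installation.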
--
-- Stas Kolenikov, PhD, PStat (SSC) :: http://stas.kolenikov.name
-- Senior Survey Statistician, Abt SRBI :: work email kolenikovs at srbi dot com
-- Opinions stated in this email are mine only, and do not reflect the position of my employer

On Fri, Oct 26, 2012 at 1:25 PM, <S.Jenkins@lse.ac.uk> wrote:
> Bill Gould wrote a very informative post about Stata's seed on Wed 24 Oct.
>
> In part, he wrote:
> ==========================
> Think of the random-number generator as producing an infinitely long
> sequence of states:
>
> -------------------------------------------------------------------------
> state0 -> state1 -> state2 -> ... -> state{2^124} -> state0 -> state1 ...
>
> where,
>
> state0 = X075bcd151f123bb5159a55e50022865700043e55,
>
> state1 = X5b15215854f24767556efaba82801d9b0004330a,
>
> and so on,
>
> and where the i-th pseudo random number is given by g(state{i}).
> -------------------------------------------------------------------------
>
> The sequence may be infinitely long, but it repeats. The period is
> approximately 2^124 in the case of KISS.
>
> The easy-to-type 32-bit seed provides 2^32 entry points into this
> sequence
>
> ---------------------------------------------------------------------
> state0 -> state1 -> ... -> state{2^96} -> ... -> state{2^124} -> ...
>    |                           |                      |
> 123456789                  ????????                 ??????
> ---------------------------------------------------------------------
> ========================
>
> Given the "infinitely long" sequence which repeats, and Bill's reference
> to "entry points", does it ever matter what number one chooses to be the
> initial seed and hence enters the sequence?
>
> I note that the default Stata 32-bit seed is "123456789", which is 9
> digits and an odd number. Are there potentially adverse consequences of
> setting a 32-bit seed using an even number? Or using a seed that is less
> than some critical number of digits in length? E.g. is "1" or "20" as
> good as "123456789" or "987654321"?
>
> Many people, including me, appear to use a number with -set seed- that
> has a relatively large number of digits and is an odd number -- but I
> wonder if this is simply custom and practice, or whether there is a
> rationale. Or is any number as good as another as an entry point to the
> sequence? I searched the web for answers a while ago and did not find
> answers.
>
>
> Stephen
> ------------------
> Professor Stephen P. Jenkins <s.jenkins@lse.ac.uk>
> Department of Social Policy and STICERD
> London School of Economics and Political Science
> Houghton Street, London WC2A 2AE, UK
> Tel: +44(0)20 7955 6527
> Changing Fortunes: Income Mobility and Poverty Dynamics in Britain, OUP
> 2011, http://ukcatalogue.oup.com/product/9780199226436.do
> Survival Analysis Using Stata:
> http://www.iser.essex.ac.uk/survival-analysis
> Downloadable papers and software: http://ideas.repec.org/e/pje7.html

