
Re: st: Problem with seed and bootstrap


From   Phil Schumm <pschumm@uchicago.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Problem with seed and bootstrap
Date   Mon, 19 Sep 2005 12:16:38 -0500

At 11:45 AM -0500 9/19/05, Richard Williams wrote:
> I think what is most troubling about -unstable- is that its results can't be reproduced; you can run the exact same problem twice and get different sort orders.
> <snip>
>
> If -unstable- produced reproducible outcomes, I think people would feel more comfortable with it.

Reproducibility is indeed the issue here, but it is the use of the -stable- option (more precisely, the reliance on it) that can lead to a result not being reproducible. If -sort- can yield different outcomes (i.e., if the variables you are sorting on do not uniquely identify the observations) *and* if the results you are producing can be affected by this, then that is a programming bug. You may get lucky using -stable-, but in a sense that just moves the problem upstream, because the result then depends on whatever order the data happened to be in before the sort, and it makes it less clear to someone reading the code what is going on. Moreover, you have substantially increased the likelihood that future changes to the code will cause problems.
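
To make the failure mode concrete, here is a minimal sketch with made-up data (the variables id, visit and score are hypothetical and not taken from this thread). Sorting on -id- alone leaves the order within each id undefined, so an order-dependent result such as score[1] is not guaranteed to be the same from one run to the next. -isid- is one way to detect the ambiguity.

```stata
* Hypothetical example data: id repeats, so id alone is not a unique key
clear
input id visit score
1 2 10
1 1 20
2 1 30
2 2 40
end

* Order within each id is left undefined by this sort
sort id

* This result depends on which observation happens to come first within id
by id: generate first_score = score[1]

* -isid- reports an error when the variables do not uniquely identify
* the observations; -capture noisily- shows the message and continues
capture noisily isid id
```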

If you are doing something which is dependent upon a certain ordering of the data, then that ordering should be completely and unambiguously specified beforehand. And the use of -sort- (i.e., the unstable version) with sufficient variables to establish that order is the easiest and most concise way of doing that.
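
A minimal sketch of what that might look like, again with hypothetical variables (id, visit, score) that are not from this thread. The assumption here is that id and visit together identify each observation; -isid- checks that assumption before anything order-dependent is computed.

```stata
* Hypothetical example data: (id, visit) is a unique key
clear
input id visit score
1 2 10
1 1 20
2 1 30
2 2 40
end

* Stop with an error unless id and visit uniquely identify the observations
isid id visit

* The sort order is now fully specified, so no -stable- option is needed
sort id visit

* First score within each id, taken in visit order
by id: generate first_score = score[1]

list, sepby(id)
```

With these data, first_score is 20 for id 1 and 30 for id 2 on every run, because the order within each id is fixed by visit rather than by the order the data happened to arrive in.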


-- Phil



