Statalist The Stata Listserver



RE: st: set seed and gsample


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: set seed and gsample
Date   Mon, 12 Mar 2007 10:38:32 -0000

Agreed. The moral is ancient but important. 
If people do not state their real problem fully
and precisely, the solution offered may not be
appropriate. 

Nick 
n.j.cox@durham.ac.uk 

Ben Jann
 
> Okay, Nick. Of course your statement is the correct answer to the
> quoted text. I already had some background information when I wrote my
> "No". What Shige meant was to get the same sample each time the
> do-file used to draw the sample is run. It is a multistage sample and
> the do-file contains several -gsample- commands. Shige's colleague
> misunderstood your advice and started to insert -set seed- commands
> before each call to -gsample- although the seed should only be set
> once at the beginning of the do file in this situation.
> ben
> 
> On 3/11/07, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> > I say No to your No.
> >
> > Shige's question was
> >
> > "I want to make sure I get the same sample each time
> > I invoke the gsample command".
> >
> > Consider this:
> >
> > . clear
> >
> > . set obs 100
> > obs was 0, now 100
> >
> > . gen i = _n
> >
> > . set seed 280352
> >
> > . gsample 20 , generate(sample1)
> >
> > . gsample 20 , generate(sample2)
> >
> > . assert sample1 == sample2
> > 31 contradictions in 100 observations
> > assertion is false
> > r(9);
> >
> > . set seed 280352
> >
> > . gsample 20 , generate(sample3)
> >
> > . assert sample1 == sample3
> >
> > <NB: no response>
> >
> > Perhaps Shige did not mean what he said, or
> > I am quoting out of context, but
> > that quoted text was what I was responding to.
> >
> > Nick
> > n.j.cox@durham.ac.uk
> >
> > Ben Jann
> >
> > > Some days ago, the following issue concerning -gsample- was posted by
> > > Shige on behalf of a colleague:
> > >
> > > > I am trying to draw a PPS sample using the "gsample" command.  I want
> > > > to make sure I get the same sample each time I invoke the gsample
> > > > command by using the "set seed" command. However, even after I set the
> > > > random seed using "set seed" command, I still get different sample
> > > > each time. Has anybody encountered this problem?
> > >
> > > Inspection of Shige's colleague's do-file revealed that some -sort-
> > > and -bysort- commands were causing the trouble. It had nothing to do
> > > with -gsample-. -sort- has its own random number generator to break
> > > ties that does not depend on -set seed-. To make -sort- stable either
> > > specify the -stable- option or, better, add a -set sortseed- command
> > > at the beginning of the script (see -help sortseed-).
> > >
> > > Nick wrote:
> > > > You must -set seed- immediately before calling -gsample-.
> > >
> > > No. The seed should be set somewhere in the beginning of the do-file,
> > > before any command that might possibly depend on it (and it should
> > > only be set once - do not set the seed repeatedly in one script).
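
The advice in this thread (set the seed once at the top, and make -sort- reproducible too) can be sketched as a do-file skeleton. This is only a minimal sketch under stated assumptions: it assumes the user-written -gsample- is installed, the variable stratum and the sortseed value are hypothetical, and the names stage1 and stage2 are made up for illustration. It is not the do-file discussed above.

```stata
* Sketch only: assumes -gsample- (user-written) is installed.
clear
set obs 100
generate i = _n
generate stratum = mod(_n, 4)   // hypothetical variable with many ties

set seed 280352       // set once, before any command that uses random numbers
set sortseed 280352   // hypothetical value; makes tie-breaking in -sort- repeatable

sort stratum                    // ties within stratum are now broken reproducibly
gsample 20, generate(stage1)    // first draw
gsample 20, generate(stage2)    // second draw; no further -set seed- here
```

Run from the top, a do-file laid out like this should give the same stage1 and stage2 on every run, while stage1 and stage2 still differ from each other because the seed is not reset between the two calls.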
