Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: inconsistent random numbers even using -set seed-


From   Phil Clayton <[email protected]>
To   [email protected]
Subject   Re: st: inconsistent random numbers even using -set seed-
Date   Sat, 1 Feb 2014 09:49:46 +1100

Not quite, because you still have a problem with potential ties:

> isid make, sort
> set seed 1234
> gen rand = runiform()
> 
> bys rep78 (rand) : keep if _n <= _N/2

You're sorting by rand within rep78, but there's no guarantee that rand is unique. Because -generate- stores new variables as float by default, duplicates are quite likely in a large dataset. You can reduce the chance of duplicates by making rand double precision, but you should still confirm it's unique:

isid make, sort
set seed 1234
gen double rand = runiform()
isid rep78 rand, missok sort
by rep78: keep if _n <= _N/2
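
To see why the storage type matters, here is a minimal sketch (not from the original thread) that draws 100,000 random numbers into a float variable and into a double variable and reports duplicates in each. The number of observations and the variable names rand_f and rand_d are assumptions chosen purely for illustration, and the exact counts will depend on your Stata version and its random-number generator.

```stata
* Illustration only: compare duplicates in float vs. double random draws
clear
set obs 100000
set seed 1234

* rand_f and rand_d are hypothetical names; they hold separate draws
gen float  rand_f = runiform()
gen double rand_d = runiform()

* -duplicates report- tabulates how many observations share a value
duplicates report rand_f
duplicates report rand_d
```

For the sampling problem itself, what matters is uniqueness within rep78, which is what the -isid rep78 rand- check tests.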

If the second -isid- fails you can add a second random variable:
gen double rand2 = runiform()
isid rep78 rand rand2, missok sort
(etc)
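
The "keep adding random variables until the sort is unique" idea can also be written as a loop. This is only a sketch under stated assumptions: the variable names rand1, rand2, ... and the local macros are made up for the example, and it is not a description of how -rsort- is implemented.

```stata
sysuse auto, clear
isid make, sort          // start from a reproducible order
set seed 1234

local vars               // list of random variables created so far (hypothetical names)
local k  = 0
local rc = 1
while `rc' {
    local ++k
    gen double rand`k' = runiform()
    local vars `vars' rand`k'
    * -isid- fails with a nonzero return code if ties remain within rep78
    capture isid rep78 `vars', missok sort
    local rc = _rc
}

* data are now sorted uniquely by rep78 and the random variable(s)
by rep78: keep if _n <= _N/2
```

In a small dataset such as auto the loop will normally stop after the first pass; the extra variables only come into play when ties actually occur.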

Incidentally my -rsort- command (SSC) does all of this for you (although it looks like I put in a -version 12- statement which is probably a little too strict):
sysuse auto, clear
rsort, id(make) seed(1234) by(rep78)
by rep78: keep if _n <= _N/2

Phil

On 1 Feb 2014, at 1:38 am, Seed, Paul <[email protected]> wrote:

> Thank you also to Phil Schumm.  
> (and Bill Gould for the original post that Phil links to).
> That answers my second question.
> 
> -isid- is a useful command that I have not come across before.
> For those who don't know, it checks whether the specified variables uniquely identify the 
> observations; and with the -sort- option carries out a genuinely unique sort.
> 
> I suppose the ideal code for my example is now 
> 
> ***************************
> * Example code showing solution 2*
> 
> version 11.2
> set more off
> sysuse auto, clear
> 
> bys rep78: su  price mpg
> 
> * Crucial change here
> isid make, sort
> set seed 1234
> gen rand = runiform()
> 
> bys rep78 (rand) : keep if _n <= _N/2
> bys rep78: su  price mpg
> 
> * End example *
> ***********************
> 
> 
> Paul T Seed, Senior Lecturer in Medical Statistics, 
> Division of Women's Health, King's College London
> Women's Health Academic Centre, King's Health Partners 
> (+44) (0) 20 7188 3642.
> 
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/



