Re: st: repeatedly shuffle number sequence


From   Clinton Thompson <clintonjthompson@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: repeatedly shuffle number sequence
Date   Tue, 25 Oct 2011 11:55:38 +0200

Point well-taken, Nick.
Many thanks,
Clint

On Tue, Oct 25, 2011 at 10:45 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> This sounds like the sort of problem in which you can spend more time
> working out the most efficient way to do it than actually doing it.
> You can answer your own question by timings with numbers of
> observations and variables close to what you will be using. My own
> instinct is to wonder about creating a long dataset with one variable
> divided into blocks and then finally doing a -reshape wide- but these
> days -sort-s are pretty fast in Stata unless your dataset is enormous.
>
> Nick
>
> On Tue, Oct 25, 2011 at 9:08 AM, Clinton Thompson
> <clintonjthompson@gmail.com> wrote:
>
>> I'm using Stata/SE 11.2 for Windows.
>>
>> This is a question that is part programming, part efficiency, and part
>> style.  Consider a sequence of numbers, say [1,10], that I want to
>> shuffle/randomize several times such that I end up w/ k variables
>> where each of the variables created contains a random shuffling of the
>> values [1,10].  I approached this using a rather simple and
>> rudimentary -foreach- loop:
>>
>>>>>>>>>>>>>>> BEGIN >>>>>>>>>>>
>>
>> clear
>> set obs 10
>> set seed 20111025
>>
>> foreach num of numlist 1/5 {
>>  gen int seq`num' = _n
>>  gen rand`num' = runiform()
>>  sort rand`num'
>>  drop rand`num'
>> }
>>
>> <<<<<<<<<< END <<<<<<<<<<<<<
>>
>> This approach works -- in the sense that k variables are created where
>> each variable contains a random shuffling of the numbers from 1-10 --
>> but I'm not sure if this is the best way to approach this kind of
>> problem.  Does the creation of a -wide- dataset (as in my approach)
>> make the most sense (I'll be expanding this to 20-25 variables instead
>> of the 5 currently programmed)?  And I can easily change the sequences
>> of the values for all of the seq* variables depending on which of the
>> rand* variables is sorted but this doesn't seem too robust.  Any
>> thoughts or advice on whether this is the best (read:  correct and
>> most efficient?) approach to this problem is most appreciated.
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
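
For readers who want to try the long-then-reshape idea Nick mentions, here is a minimal sketch. It is not code from the thread: the variable names block, u and pos, and the locals n and k, are assumptions chosen for illustration, and it has not been timed against the -foreach- version.

```stata
clear
set seed 20111025

local n 10                 // sequence length: values 1..n (assumed)
local k 5                  // number of shuffled copies (assumed)
local N = `n' * `k'

* Long layout: k blocks of n observations each
set obs `N'
gen int block = ceil(_n / `n')
gen int seq   = mod(_n - 1, `n') + 1    // 1..n within each block

* A single sort shuffles the rows of every block at once
gen double u = runiform()
sort block u
by block: gen int pos = _n              // new row position within the block
drop u

* Back to wide: one row per position, one seq# variable per block
reshape wide seq, i(pos) j(block)
```

This should leave seq1-seq5, each holding some ordering of 1-10, with only one -sort- regardless of k. To compare it with the loop on realistic sizes, as Nick suggests, each version could be wrapped in -timer on 1- and -timer off 1-, followed by -timer list 1-.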


