Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: repeatedly shuffle number sequence
Date: Tue, 25 Oct 2011 09:45:55 +0100

This sounds like the sort of problem in which you can spend more time
working out the most efficient way to do it than actually doing it.

You can answer your own question by timings with numbers of
observations and variables close to what you will be using.

My own instinct is to wonder about creating a long dataset with one
variable divided into blocks and then finally doing a -reshape wide-,
but these days -sort-s are pretty fast in Stata unless your dataset is
enormous.

Nick

On Tue, Oct 25, 2011 at 9:08 AM, Clinton Thompson
<clintonjthompson@gmail.com> wrote:
> I'm using Stata/SE 11.2 for Windows.
>
> This is a question that is part programming, part efficiency, and part
> style. Consider a sequence of numbers, say [1,10], that I want to
> shuffle/randomize several times such that I end up w/ k variables
> where each of the variables created contains a random shuffling of the
> values [1,10]. I approached this using a rather simple and
> rudimentary -foreach- loop:
>
> >>>>>>>>>>>>> BEGIN >>>>>>>>>>>
>
> clear
> set obs 10
> set seed 20111025
>
> foreach num of numlist 1/5 {
>     gen int seq`num' = _n
>     gen rand`num' = runiform()
>     sort rand`num'
>     drop rand`num'
> }
>
> <<<<<<<<<< END <<<<<<<<<<<<<
>
> This approach works -- in the sense that k variables are created where
> each variable contains a random shuffling of the numbers from 1-10 --
> but I'm not sure if this is the best way to approach this kind of
> problem. Does the creation of a -wide- dataset (as in my approach)
> make the most sense (I'll be expanding this to 20-25 variables instead
> of the 5 currently programmed)? And I can easily change the sequences
> of the values for all of the seq* variables depending on which of the
> rand* variables is sorted but this doesn't seem too robust. Any
> thoughts or advice on whether this is the best (read: correct and
> most efficient?) approach to this problem is most appreciated.
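
For illustration only, the long-then-reshape idea mentioned in the reply could look something like the sketch below. It is not code from the thread: the variable names block, id, rand and seq and the locals n, k and N are made up for the example, and the -timer- calls are just one way of getting the timings suggested above. Each block of n observations is shuffled by sorting on a random number within the block, and -reshape wide- then turns the k blocks into seq1 ... seqk.

```stata
* Sketch only: block, id, rand, seq, n, k and N are illustrative names,
* not taken from the original posts.
clear
set seed 20111025

* n = length of the sequence, k = number of shuffled copies wanted
local n 10
local k 5
local N = `n' * `k'

timer clear 1
timer on 1

* One long variable: k blocks, each holding the values 1..n
set obs `N'
gen int block = ceil(_n / `n')
gen int seq = mod(_n - 1, `n') + 1

* Shuffle within each block by sorting on a random number
gen double rand = runiform()
sort block rand
by block: gen int id = _n
drop rand

* Go wide: one row per position, one column (seq1 ... seqk) per block
reshape wide seq, i(id) j(block)

timer off 1
timer list 1
```

Wrapping the -foreach- version from the original post in the same -timer on-/-timer off- pair, with n and k set close to the sizes actually needed, would give a direct comparison of the two approaches.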

