Statalist



Re: st: Saving intermediate results (variables) when running -simulate-


From: "Rodrigo A. Alfaro" <raalfaroa@gmail.com>
To: <statalist@hsphsun2.harvard.edu>
Subject: Re: st: Saving intermediate results (variables) when running -simulate-
Date: Thu, 23 Aug 2007 11:18:38 -0400

Rachel,

Maarten gave you the efficient answer: use -set seed-. With that, you can reproduce exactly the same simulations in the future and change whatever you want in the analysis of the simulated datasets.
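
As a minimal sketch of that approach (the program name mysim, the data-generating process, and the sample size are made up for illustration, and -rnormal()- assumes Stata 10 or later):

```stata
* Hypothetical simulation program: draw x and y, then regress y on x
capture program drop mysim
program define mysim, rclass
    drop _all
    set obs 100
    generate x = rnormal()
    generate y = 1 + 2*x + rnormal()
    regress y x
    return scalar b = _b[x]
end

* Set the seed once, before -simulate-, so the whole run is reproducible
set seed 12345
simulate b = r(b), reps(20) nodots: mysim
summarize b
```

Running the last three lines again with the same seed should give the same 20 estimates, so nothing but the seed needs to be kept.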

Alternatively, but less efficiently, you could divide the process into two steps: (1) generate and save the simulated datasets, (2) do the analysis on them using loops. It is inefficient for two reasons: you are storing simulated datasets (trash) on your hard drive, and the same process (generating the data, then doing the analysis) will take much more time this way than with -simulate-.
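
A rough sketch of the two-step version, assuming a made-up data-generating process and hypothetical file names (simdata1.dta, ..., simresults.dta) written to the current directory:

```stata
* Step 1: generate and save the simulated datasets (file names are hypothetical)
set seed 12345
forvalues i = 1/20 {
    drop _all
    set obs 100
    generate x = rnormal()
    generate y = 1 + 2*x + rnormal()
    save simdata`i', replace
}

* Step 2: loop over the saved datasets and collect the estimates
tempname results
postfile `results' rep b using simresults, replace
forvalues i = 1/20 {
    use simdata`i', clear
    quietly regress y x
    post `results' (`i') (_b[x])
}
postclose `results'

use simresults, clear
summarize b
```

The 20 saved .dta files are the "trash" mentioned above: they stay on disk only so that step 2 can be rerun without regenerating the data.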

I don't know under what conditions the alternative is actually necessary. Perhaps your model is so far from the standard uniform or multivariate normal that you have to compute several transformations. For example, you are interested in regressing y on x and analyzing that relationship (prediction, etc.), but y and x must be generated by a very long, time-consuming procedure.

Rodrigo.



----- Original Message -----
From: "Maarten buis" <maartenbuis@yahoo.co.uk>
To: <statalist@hsphsun2.harvard.edu>
Sent: Thursday, August 23, 2007 3:04 AM
Subject: Re: st: Saving intermediate results (variables) when running -simulate-



--- Rachel <academicgirl@gmail.com> wrote:
> If I understand this correctly, the estimates resulting from the
> simulation below should be the same for all 20 repetitions if the
> same seed is used for each one.

That would be true, but that is not what I have done in the example, or
what I advised you to do. If you look at the example (and actually
execute it; you will see it from the results), you will see that I set
the seed only twice: once for generating the original 20 simulations,
and once for reproducing the 20 datasets. So I proposed that you set the
seed, then create 20, or 100, 1000, or whatever number of different
random datasets. In that case you can recreate all those datasets just
by setting that original seed again. What I did not propose is that you
set the seed each time a random dataset is created, as that would indeed
lead to 20 identical datasets. In other words, -set seed 12345- should
appear outside the loop, not inside the loop.
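
To illustrate the difference, here is a small sketch (not the example from earlier in the thread; the variable, the sample size, and the use of -rnormal()- are assumptions chosen for illustration):

```stata
* Seed set once, outside the loop: 20 different datasets,
* all of which can be recreated by setting the same seed again
set seed 12345
forvalues i = 1/20 {
    drop _all
    set obs 100
    generate x = rnormal()
    summarize x, meanonly
    display "rep `i': mean of x = " r(mean)
}

* Seed set inside the loop: every repetition draws the same numbers,
* so all 20 datasets are identical
forvalues i = 1/20 {
    set seed 12345
    drop _all
    set obs 100
    generate x = rnormal()
    summarize x, meanonly
    display "rep `i': mean of x = " r(mean)
}
```

The first loop should display 20 different means, and the second should display the same mean 20 times.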

> Would it make sense to set the seed itself to be a random number and
> then return the seed as a scalar so that the rest of the dataset
> could be reproduced?

No, that is unnecessary. Setting the original seed again is enough to
recreate all the datasets.



-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------


*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/


