Stata The Stata listserver
[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

st: Re: general simulation program (postfile)


From   "Michael Blasnik" <[email protected]>
To   <[email protected]>
Subject   st: Re: general simulation program (postfile)
Date   Fri, 15 Aug 2003 12:41:53 -0400

I have a couple of suggestions to improve or speed up your program.

1)  You could generate 1000 poisson random numbers at a time at the
beginning of the program (instead of just one at a time) and then use a loop
to move n1 n2 and n into local macros (e.g., n1_`i', n2_`i', n_`i').  Then
you could run the main loop referencing these locals.  This might make your
code look a little better and run a little faster.  I would also look into
just grabbing the algorithms from within rndpoi and integrating that into
your loop.

2)  You could replace -nearest- with an alternative that I've written that's
about 10x faster.  -nearest- uses brute force to compare each observation to
every other observation in the dataset one at a time (execution time growing
with n^2).  The approach I used only loops through the obervations once,
calculating a tempvar for distance and sorting the dataset to find the
nearest match.  The execution speed improvement from this approach can be an
order of magnitude.

I tried your program and then ran it again using this -nearest- replacement
and found that, with lambda1=20 and lambda2=30, execution time goes from 230
seconds to 30 seconds.  I then tried a version incorporating my first
suggestion and execution time dropped to 18 seconds.  The speed improvement
from the -nearest- replacement would be even larger with larger values of
lambda (I got tired of waiting with values of 40 and 80)

If you want a copy of nearest2, email me directly and I can send it to you.

Michael Blasnik
[email protected]

----- Original Message ----- 
From: "Benoit Dulong" <[email protected]>
To: "statalist" <[email protected]>
Sent: Friday, August 15, 2003 10:40 AM
Subject: st: general simulation program (postfile)


> The following simulation program will be the first Stata program
> presented to my director at University of Montreal.
> It works, but it is slow, and it feels bizarre...
> (I have to drop _all two times)
>
> Question-1
> In general, is that how a simulation should be done with Stata
> (postfile, post, postclose) ?
>
> Question-2
> To generate a random number n1, I use two syntax lines:
> --------------------
> rndpoi 1 `lambda1'
> local n1 = xp
> --------------------
> Is it possible to do this in only one syntax line ?
> (pass directly the random number to local n1)
>
> Question-3
> It it possible not to drop _all two times ?
> (use subroutine ?)
>
> Thank you.
> Benoit Dulong
>
> ----------------------------------------------------------------------
> MY ALGORITHM
> ----------------------------------------------------------------------
>
> [1]
> generate random number n1
> from a Poisson distribution with parameter lambda1
>
> generate random number n2
> from a Poisson distribution with parameter lambda2
>
> n = n1+n1
>
> [2]
> generate random vector x (n component)
> from uniform distribution (generate x=uniform())
>
> generate random vector y (n component)
> from uniform distribution (generate y=uniform())
>
> [3]
> for each (x[i],y[i]), i=1...n,
> calculate the distance to the nearest neighbor
> using -- nearest -- from N.J COX (NJC 1.1.0 10 January 2003)
> the syntax
> nearest x y, id(id) dist(h)
> gives a file with n rows and 4 columns (x y id h)
>
> [4]
> from the file obtained in [3]
> calculate the mean (mh) and var (vh) of the n values of h
>
> [5]
> repeat [1] to [4] 1000 times
>
> ----------------------------------------------------------------------
> MY PROGRAM
> ----------------------------------------------------------------------
>
> program define myprog1
> version 7.0
> args lambda1 lambda2
>
> postfile mysim1 n1 n2  n mh vh using "myresults1", replace
>
> forvalues i=1(1)1000{
>
> drop _all
>
> display "i= " `i'
>
> rndpoi 1 `lambda1'
> local n1 = xp
>
> rndpoi 1 `lambda2'
> local n2 = xp
>
> local n = `n1' + `n2'
>
> drop _all
> set obs `n'
>
> gen x = uniform()
> gen y = uniform()
> nearest x y, id(id) dist(h)
>
> summarize h
> local mh = r(mean)
> local vh = r(Var)
>
> post mysim1 (`n1') (`n2') (`n') (`mh') (`vh')
>
> }
>
> postclose mysim1
>
> end
>


*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2024 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index