[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

From |
"Nick Cox" <n.j.cox@durham.ac.uk> |

To |
<statalist@hsphsun2.harvard.edu> |

Subject |
st: RE: general simulation program (postfile) |

Date |
Fri, 15 Aug 2003 18:01:26 +0100 |

Benoit Dulong > The following simulation program will be the first Stata program > presented to my director at University of Montreal. > It works, but it is slow, and it feels bizarre... > (I have to drop _all two times) > > Question-1 > In general, is that how a simulation should be done with Stata > (postfile, post, postclose) ? There are lots of ways you could simulate. You need not use -postfile-. Personally, I always try to do stuff in place whenever possible without masses of file I/0. > Question-2 > To generate a random number n1, I use two syntax lines: > -------------------- > rndpoi 1 `lambda1' > local n1 = xp > -------------------- > Is it possible to do this in only one syntax line ? > (pass directly the random number to local n1) Not with -rndpoi-. But then -rndpoi- enforces its way of seeing things upon you. It does not offer the flexibility you need here. Avoid it and use -genpoisson-. > Question-3 > It it possible not to drop _all two times ? > (use subroutine ?) Yes, you can avoid -drop-. > ------------------------------------------------------------ > ---------- > MY ALGORITHM > ------------------------------------------------------------ > ---------- > > [1] > generate random number n1 > from a Poisson distribution with parameter lambda1 > > generate random number n2 > from a Poisson distribution with parameter lambda2 > > n = n1+n1 > > [2] > generate random vector x (n component) > from uniform distribution (generate x=uniform()) > > generate random vector y (n component) > from uniform distribution (generate y=uniform()) > > [3] > for each (x[i],y[i]), i=1...n, > calculate the distance to the nearest neighbor > using -- nearest -- from N.J COX (NJC 1.1.0 10 January 2003) > the syntax > nearest x y, id(id) dist(h) > gives a file with n rows and 4 columns (x y id h) > > [4] > from the file obtained in [3] > calculate the mean (mh) and var (vh) of the n values of h > > [5] > repeat [1] to [4] 1000 times > > ------------------------------------------------------------ > ---------- > MY PROGRAM > ------------------------------------------------------------ > ---------- > > program define myprog1 > version 7.0 > args lambda1 lambda2 > > postfile mysim1 n1 n2 n mh vh using "myresults1", replace > > forvalues i=1(1)1000{ > > drop _all > > display "i= " `i' > > rndpoi 1 `lambda1' > local n1 = xp > > rndpoi 1 `lambda2' > local n2 = xp > > local n = `n1' + `n2' > > drop _all > set obs `n' > > gen x = uniform() > gen y = uniform() > nearest x y, id(id) dist(h) > > summarize h > local mh = r(mean) > local vh = r(Var) > > post mysim1 (`n1') (`n2') (`n') (`mh') (`vh') > > } > > postclose mysim1 > > end > Various thoughts: 1. You are using -post-. Not essential. 2. You are invoking a program -rndpoi- to generate 2000 randoms from a Poisson 2000 times. You can call -genpoisson- twice. 3. 2000 -drop _all- can be avoided. 4. -display- serves no purpose beyond showing progress of program. Once it's working, you can cut that. 5. Statements like local n1 = xp hinge on the accident that you have just one observation in memory. Dangerous style. 6. Invoking -nearest- 1000 times imposes an overhead of interpretation. Also you are forcing it to generate identifiers which you never use. Read the code into your program and cut all surplus. Here the result of a bit of hacking around. program define myprog2 version 7.0 args lambda1 lambda2 clear set obs 1000 genpoisson n1, mu(`lambda1') genpoisson n2, mu(`lambda2') gen n = n1 + n2 qui { gen which = _n expand n gen x = uniform() gen y = uniform() gen dist = . gen mh = . gen vh = . sort which gen long obs = _n forval i = 1/1000 { noi su obs if which == `i', meanonly local imin = r(min) local imax = r(max) forval j = `imin'/`imax' { scalar mind = . forval k = `imin'/`imax' { if (`j' != `k') { scalar d = /* */ (x[`j'] - x[`k'])^2 + (y[`j'] - y[`k'])^2 scalar mind = /* */ min(scalar(d), mind) } } replace dist = sqrt(mind) in `j' } summarize dist in `imin'/`imax', detail replace mh = r(mean) in `imin' replace vh = r(Var) in `imin' } bysort which (mh) : keep if _n == 1 drop obs x y dist } end Comments on that: get all the Poissons at once; less fishing time what if n is 0? I think it goes OK get your randoms on unit square at once do things -in- rather than -if- wherever possible: by Blasnik's First Law, that's a lot faster as Michael points out in his email which I've just seen, the -nearest- algorithm is very crude. Nick n.j.cox@durham.ac.uk * * For searches and help try: * http://www.stata.com/support/faqs/res/findit.html * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**References**:**st: general simulation program (postfile)***From:*Benoit Dulong <bendulong@vdn.ca>

- Prev by Date:
**st: Re: general simulation program (postfile)** - Next by Date:
**st: RE: ivprob with clustered SEs?** - Previous by thread:
**st: Re: general simulation program (postfile)** - Next by thread:
**st: Re: OSX 10.2.6 crash - STATA 7** - Index(es):

© Copyright 1996–2015 StataCorp LP | Terms of use | Privacy | Contact us | What's new | Site index |