[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

From |
jpitblado@stata.com (Jeff Pitblado, StataCorp LP) |

To |
statalist@hsphsun2.harvard.edu |

Subject |
Re: st: Nonuniform Random Variable Generation |

Date |
Tue, 27 Sep 2005 00:49:55 -0500 |

Jann Ben <ben.jann@soz.gess.ethz.ch> asks about generating random variables from a given distribution: > I would like to generate random variables with the density > > f(x) = 1/2 * (2x^2 + 4x + 1) * exp(-2x), x>=0 > > Can anyone help me with this? Fair warning -- the following reply is rather long, consisting of Mata code, a couple of do-files and an ado-file. I couldn't help myself. I used Stata to draw the function, just to get a feel for its shape; then I integrated f(z) from 0 to x to get F(x), the CDF of f(x): F(x) = 1 - 0.5 * (x + 1) * (x + 2) * exp(-2*x) Now all I need to do is invert F(x) to get the quantile function of f(x); once I have the quantile function of F(x)--invF(x) say--then I can feed it random uniforms to get randomly generated quantiles from the "f(x)" distribution. Unfortunately, if there is a closed form for the inverse function of F(x), I'd be hard pressed to produce it. Looks like we'll have to solve for invF(x) numerically. This is a great problem for Mata to solve. So I decided to write my own Newton-Raphson root finder (note that I could have used bisection or something else, but I have a particular fondness for this algorithm). A quick google search will uncover several sites that explain the basic Newton-Raphson algorithm (for those who don't have a Numerical Analysis book handy). The basic idea is to use a first order Taylor expansion of F(x) (the function we want to invert) to develop an update equation which we iterate until convergence (or we get tired of waiting). Start with an initial guess x0, the new guess x1 is then computed using x1 = x0 - F(x0)/F'(x0) = x0 - F(x0)/f(x0) In our case the root we really want to find is for G(x) = F(x) - y; where y is a (randomly generated) value in the range of F(x). So our update equation is x1 = x0 - ( F(x0) - y )/f(x0) Now that we have all the mathematical statistics out of the way, let's write some code. Here is the do-file for my Mata function -mynrinverter()-. The comments document what I did here. Notice that two of the arguments to my Mata function are pointers to the two functions F() and f(); I could have hard coded F() and f(), but decided to go for generality--it really didn't cost me much, but will pay dividends when I want to use -mynrinverter()- on a similar problem at a later date. ***** BEGIN: mynrinverter.do * mynrinverter.do mata : mata clear mata set matastrict on real scalar mynrinverter( real scalar y, // y = f(x) pointer(real scalar function) f, // f() -- the function pointer(real scalar function) df, // f'() -- f()'s derivative real scalar maxiter, // maximum iterations allowed real scalar epsilon // convergence criterion ) { real scalar x0 // the current guess real scalar x1 // the updated guess real scalar delta // = [ f(x0) - y ]/f'(x0) real scalar iter // iteration counter // input missing => output missing if (missing(y)) return(.) // set the default value for the maximum iterations if (missing(maxiter) | maxiter < 1) { maxiter = 100 } // set the default value for the convergence criterion if (missing(epsilon) | epsilon <= 0) { epsilon = 1e-9 } // we will use the input value as our initial guess; this is probably // not a good initial guess for all problems, but it will be find for // the problem at hand x0 = y x1 = y // set the step value to the initial guess to make sure we get inside // the loop delta = x0 iter = 0 while (abs(delta/x0) > epsilon) { // update the current guess x0 = x1 // get the new step value delta = ((*f)(x0) - y)/(*df)(x0) if (missing(delta)) return(.) // get the new guess x1 = x0 - delta // step the iteration counter and display some information // about our progress ++iter printf("iter = %4.0f x1 = %9.6f f(x1) = %9.6f delta = %9.6f\n", iter, x1, (*f)(x1), delta) if (iter >= maxiter) return(.) } // return the value we converged to return(x1) } mata mosave mynrinverter, replace end ***** END: mynrinverter.do I can now run the above do-file to create an mo-file. If I type, . do mynrinverter Stata will create mynrinvert.mo, then I can use -myrninverter()- in any of my Mata sessions so long as mynrinvert.mo is somewhere in my adopath--just like ado-files. I then wrote the following do-file to test things out: ***** BEGIN: test.do clear // build mynrinverter.mo run mynrinverter ls *.mo mata: // F'(x) -- the density function Ben originally asked about real scalar dfx(real scalar x) { if (x < 0) return(.) return( ( (x + 1)^2 - 0.5 ) * exp(-2*x) ) } mata mosave dfx, replace // F(x) -- the CDF of Ben's density function real scalar fx(real scalar x) { if (x < 0) return(.) return( 1 - 0.5 * (x + 1) * (x + 2) * exp(-2*x) ) } mata mosave fx, replace // Problem: find x, such that Pr(X < x) = 0.3 x = mynrinverter(.3, &fx(), &dfx(), ., .) // make sure I get y = 0.3 back from the 'x' -mynrinverter()- found y = fx(x) x, y // save the answer so I can graph it later st_numscalar("answer", x) end // mata: // graph f(), F(), and visually confirm that I got a reasonable answer from // -mynrinverter()- tw function dfx = ( (x + 1)^2 - 0.5 ) * exp(-2*x), /// range(0 5) /// || function fx = 1 - 0.5 * (x + 1) * (x + 2) * exp(-2*x), /// range(0 5) /// yline(.3) xline(`=scalar(answer)') graph export test.eps, replace exit ***** END: test.do Now that I have a way to get the value of invF(u) for some random uniform value, all I need is a way to generate a Stata dataset. Now that -mynrinverter()- is available in the form of an mo-file, we can use the Mata function it contains in an ado-file. Here is an example: ***** BEGIN: mynrquantiles.ado program mynrquantiles version 9.1 syntax varname(numeric) [if] [in], /// gen(name) /// f(name) /// Mata function name for f() df(name) /// Mata function name for f'() [ /// maxiter(real -1) /// epsilon(real -1) /// ] confirm new var `gen', exact marksample touse quietly gen `gen' = . quietly mata : mynrquantiles(&`f'(), &`df'(),`maxiter',`epsilon') end mata : mata set matastrict on // NOTE: We could even put the code for -mynrinverter()- here, but that would // be a shame when we could put it in a library so that other's can use it // too. The following Mata function is intimately related to this ado-file so // it really does belong here. void mynrquantiles( pointer(real scalar function) f, // f() -- the function pointer(real scalar function) df, // f'() -- f()'s derivative real scalar maxiter, // maximum iterations allowed real scalar epsilon // convergence criterion ) { real matrix data, nobs, i string scalar x, y, touse // x is the output variable x = st_local("gen") // y is the input variable y = st_local("varlist") // identify relevant obs. touse = st_local("touse") // use a view so we can update the Stata dataset with the answers st_view(data, ., (x, y), touse) // loop through the data to get the quantiles nobs = rows(data) for(i=1; i<=nobs; i++) { data[i,1] = mynrinverter(data[i,2],f,df,maxiter,epsilon) } } end exit ***** END: mynrquantiles.ado Here is small do-file to test that -mynrquantiles- works properly. It basically generates some data using -mynrquantiles- and graphs it to visually checks that the resulting quantiles are valid. ***** BEGIN: test2.do clear // generate a grid of 30 evenly spaced probabilities from 0 to 1; stop short // of 1, since the anser is x = infinity set obs 30 gen y = (_n-1)/_N // use -mynrquantiles- to generate to x values associated with the above y // values mynrquantiles y, gen(x) f(fx) df(dfx) // graph the results, the scatter plot should fall on the plot of F(x) tw function fx = 1 - 0.5 * (x + 1) * (x + 2) * exp(-2*x), /// range(0 5) /// || scatter y x graph export test2.eps, replace exit ***** END: test2.do Cheers, --Jeff Pitblado jpitblado@stata.com * * For searches and help try: * http://www.stata.com/support/faqs/res/findit.html * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:**RE: st: Nonuniform Random Variable Generation***From:*"Scott Merryman" <smerryman@kc.rr.com>

- Prev by Date:
**st: historical maximum** - Next by Date:
**st: -akdensity- update available on SSC** - Previous by thread:
**st: Nonuniform Random Variable Generation** - Next by thread:
**RE: st: Nonuniform Random Variable Generation** - Index(es):

© Copyright 1996–2016 StataCorp LP | Terms of use | Privacy | Contact us | What's new | Site index |