Stata The Stata listserver
[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

Re: st: Nonuniform Random Variable Generation

From (Jeff Pitblado, StataCorp LP)
Subject   Re: st: Nonuniform Random Variable Generation
Date   Tue, 27 Sep 2005 00:49:55 -0500

Jann  Ben <> asks about generating random variables
from a given distribution:

> I would like to generate random variables with the density
>   f(x) = 1/2 * (2x^2 + 4x + 1) * exp(-2x),  x>=0
> Can anyone help me with this?

Fair warning -- the following reply is rather long, consisting of Mata code, a
couple of do-files and an ado-file.  I couldn't help myself.

I used Stata to draw the function, just to get a feel for its shape; then I
integrated f(z) from 0 to x to get F(x), the CDF of f(x):

	F(x) = 1 - 0.5 * (x + 1) * (x + 2) * exp(-2*x)

Now all I need to do is invert F(x) to get the quantile function of f(x); once
I have the quantile function of F(x)--invF(x) say--then I can feed it random
uniforms to get randomly generated quantiles from the "f(x)" distribution.

Unfortunately, if there is a closed form for the inverse function of F(x), I'd
be hard pressed to produce it.  Looks like we'll have to solve for invF(x)
numerically.  This is a great problem for Mata to solve.

So I decided to write my own Newton-Raphson root finder (note that I could
have used bisection or something else, but I have a particular fondness for
this algorithm).

A quick google search will uncover several sites that explain the basic
Newton-Raphson algorithm (for those who don't have a Numerical Analysis book
handy).  The basic idea is to use a first order Taylor expansion of F(x) (the
function we want to invert) to develop an update equation which we iterate
until convergence (or we get tired of waiting).  Start with an initial guess
x0, the new guess x1 is then computed using

	x1 = x0 - F(x0)/F'(x0) = x0 - F(x0)/f(x0)

In our case the root we really want to find is for G(x) = F(x) - y; where y is
a (randomly generated) value in the range of F(x).  So our update equation is

	x1 = x0 - ( F(x0) - y )/f(x0)

Now that we have all the mathematical statistics out of the way, let's write
some code.

Here is the do-file for my Mata function -mynrinverter()-.  The comments
document what I did here.  Notice that two of the arguments to my Mata
function are pointers to the two functions F() and f(); I could have hard
coded F() and f(), but decided to go for generality--it really didn't cost me
much, but will pay dividends when I want to use -mynrinverter()- on a similar
problem at a later date.

***** BEGIN:
mata :

mata clear
mata set matastrict on

real scalar mynrinverter(
	real scalar y,				// y = f(x)
	pointer(real scalar function) f,	// f() -- the function
	pointer(real scalar function) df,	// f'() -- f()'s derivative
	real scalar maxiter,			// maximum iterations allowed
	real scalar epsilon			// convergence criterion
	real scalar x0				// the current guess
	real scalar x1				// the updated guess
	real scalar delta			// = [ f(x0) - y ]/f'(x0)
	real scalar iter			// iteration counter

	// input missing => output missing
	if (missing(y)) return(.)

	// set the default value for the maximum iterations
	if (missing(maxiter) | maxiter < 1) {
		maxiter = 100
	// set the default value for the convergence criterion
	if (missing(epsilon) | epsilon <= 0) {
		epsilon = 1e-9

	// we will use the input value as our initial guess; this is probably
	// not a good initial guess for all problems, but it will be find for
	// the problem at hand
	x0 = y
	x1 = y
	// set the step value to the initial guess to make sure we get inside
	// the loop
	delta = x0
	iter = 0
	while (abs(delta/x0) > epsilon) {
		// update the current guess
		x0 = x1
		// get the new step value
		delta = ((*f)(x0) - y)/(*df)(x0)
		if (missing(delta)) return(.)
		// get the new guess
		x1 = x0 - delta
		// step the iteration counter and display some information
		// about our progress
		printf("iter = %4.0f x1 = %9.6f f(x1) = %9.6f delta = %9.6f\n",
			iter, x1, (*f)(x1), delta) 
		if (iter >= maxiter) return(.)
	// return the value we converged to
mata mosave mynrinverter, replace

***** END:

I can now run the above do-file to create an mo-file.  If I type,

	. do mynrinverter

Stata will create, then I can use -myrninverter()- in any of my
Mata sessions so long as is somewhere in my adopath--just like

I then wrote the following do-file to test things out:

***** BEGIN:

// build
run mynrinverter
ls *.mo


// F'(x) -- the density function Ben originally asked about
real scalar dfx(real scalar x)
	if (x < 0) return(.)
	return( ( (x + 1)^2 - 0.5 ) * exp(-2*x) )
mata mosave dfx, replace

// F(x) -- the CDF of Ben's density function
real scalar fx(real scalar x)
	if (x < 0) return(.)
	return( 1 - 0.5 * (x + 1) * (x + 2) * exp(-2*x) )
mata mosave fx, replace

// Problem:  find x, such that Pr(X < x) = 0.3
x = mynrinverter(.3, &fx(), &dfx(), ., .)
// make sure I get y = 0.3 back from the 'x' -mynrinverter()- found
y = fx(x)
x, y
// save the answer so I can graph it later
st_numscalar("answer", x)

end	// mata:

// graph f(), F(), and visually confirm that I got a reasonable answer from
// -mynrinverter()-
tw	function dfx = ( (x + 1)^2 - 0.5 ) * exp(-2*x),		///
		range(0 5)					///
||	function fx = 1 - 0.5 * (x + 1) * (x + 2) * exp(-2*x),	///
		range(0 5)					///
		yline(.3) xline(`=scalar(answer)')

graph export test.eps, replace

***** END:

Now that I have a way to get the value of invF(u) for some random uniform
value, all I need is a way to generate a Stata dataset.  Now that
-mynrinverter()- is available in the form of an mo-file, we can use the Mata
function it contains in an ado-file.  Here is an example:

***** BEGIN: mynrquantiles.ado
program mynrquantiles
	version 9.1
	syntax varname(numeric) [if] [in],	///
		gen(name)			///
		f(name)				/// Mata function name for f()
		df(name)			/// Mata function name for f'()
		[				///
			maxiter(real -1)	///
			epsilon(real -1)	///

	confirm new var `gen', exact
	marksample touse
	quietly gen `gen' = .
	quietly mata : mynrquantiles(&`f'(), &`df'(),`maxiter',`epsilon')

mata :
mata set matastrict on

// NOTE:  We could even put the code for -mynrinverter()- here, but that would
// be a shame when we could put it in a library so that other's can use it
// too.  The following Mata function is intimately related to this ado-file so
// it really does belong here.

void mynrquantiles(
	pointer(real scalar function) f,	// f() -- the function
	pointer(real scalar function) df,	// f'() -- f()'s derivative
	real scalar maxiter,			// maximum iterations allowed
	real scalar epsilon			// convergence criterion
	real matrix data, nobs, i
	string scalar x, y, touse

	// x is the output variable
	x	= st_local("gen")
	// y is the input variable
	y	= st_local("varlist")
	// identify relevant obs.
	touse	= st_local("touse")
	// use a view so we can update the Stata dataset with the answers
	st_view(data, ., (x, y), touse)
	// loop through the data to get the quantiles
	nobs = rows(data)
	for(i=1; i<=nobs; i++) {
		data[i,1] = mynrinverter(data[i,2],f,df,maxiter,epsilon)

***** END: mynrquantiles.ado

Here is small do-file to test that -mynrquantiles- works properly.  It
basically generates some data using -mynrquantiles- and graphs it to visually
checks that the resulting quantiles are valid.

***** BEGIN:

// generate a grid of 30 evenly spaced probabilities from 0 to 1; stop short
// of 1, since the anser is x = infinity
set obs 30
gen y = (_n-1)/_N

// use -mynrquantiles- to generate to x values associated with the above y
// values
mynrquantiles y, gen(x) f(fx) df(dfx)

// graph the results, the scatter plot should fall on the plot of F(x)
tw	function fx = 1 - 0.5 * (x + 1) * (x + 2) * exp(-2*x),	///
		range(0 5)					///
||	scatter y x

graph export test2.eps, replace

***** END:


--Jeff Pitblado
*   For searches and help try:

© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index