See the latest version of Bayesian models. See all of Stata's Bayesian analysis features.

See the new features in Stata 15.

Stata's **bayesmh** provides a variety of built-in Bayesian models
for you to choose from; see the full list of available likelihood models and prior distributions. Cannot
find the model you need? **bayesmh** also provides facilities for
you to program your own Bayesian models.

Adding Bayesian models is easy. Simply write a Stata program that computes a
posterior density following **bayesmh**'s convention and specify the name of the
program with the command. **bayesmh** will do the rest: it will simulate the
marginal posterior distributions for all parameters and provide posterior
summaries. After the simulation, your subsequent analyses are exactly the
same as if you used one of the built-in models.

You can write a program that computes the overall log likelihood and choose priors from a list of built-in prior distributions or you can write a program that computes the entire log posterior.

As a simple example, let's write a program that computes a log likelihood for a logistic regression model.

program logitll version 14.1 args lnf xb // compute log likelihood tempvar lnfj quietly generate double `lnfj' = ln(invlogit( `xb')) /// if $MH_y == 1 & $MH_touse quietly replace `lnfj' = ln(invlogit(-`xb')) if $MH_y == 0 & $MH_touse quietly summarize `lnfj', meanonly scalar `lnf' = r(sum) if r(N) < $MH_n { scalar `lnf' = . exit } end

We called our program **logitll**. The program has two arguments: a temporary
scalar **lnf** for storing a value of the overall likelihood and a temporary
variable **xb** that contains the linear predictor evaluated at the current
values of parameters. The parameters are regression coefficients in our
example.

Global macro **MH_y** contains the name of the binary dependent variable,
**MH_touse** identifies the estimation sample, and **MH_n** contains the total
number of observations.

We created an intermediate temporary variable **lnfj** to contain the
observation-specific log likelihood. The sum of this variable is the
overall log likelihood, which we stored in **lnf**. If the log-likelihood
computation evaluates to missing in at least one observation, the
overall likelihood **lnf** is set to missing.

We can now use our program with **bayesmh** to fit a Bayesian logistic
regression model.

Suppose that we want to model whether a car is foreign or domestic as a
function of car mileage. We specify the name of our log-likelihood evaluator
in option **llevaluator()** and specify one of the built-in priors in option
**prior()**. We use the flat prior for both coefficients, **mpg** and
**_cons**.

. bayesmh foreign mpg, llevaluator(logitll) prior({foreign:}, flat)Burn-in ... Simulation ... Model summary

Likelihood: | ||

foreign ~ logitll(xb_foreign) | ||

Prior | ||

{foreign:mpg _cons} ~ 1 (flat) (1) | ||

(1) Parameters are elements of the linear form xb_foreign. | ||

Bayesian regression MCMC iterations = 12,500 | ||

Random-walk Metropolis-Hastings sampling Burn-in = 2,500 | ||

MCMC sample size = 10,000 | ||

Number of obs = 74 | ||

Acceptance rate = .2714 | ||

Efficiency: min = .09526 | ||

avg = .0956 | ||

Log marginal likelihood = -41.616161 max = .09594 | ||

Equal-tailed | ||

foreign | Mean Std. Dev. MCSE Median [95% Cred. Interval] | |

mpg | .1706509 .0553838 .001794 .1689529 .0700874 .2867964 | |

_cons | -4.63653 1.274421 .041146 -4.564754 -7.247727 -2.325101 | |

The output is almost identical to that produced for built-in models. The only
differences are in the title and model summary. The generic 'Bayesian
regression' title is used, which can be changed via option **title()**. The
model summary displays the name of the program used to compute the likelihood
of this model. The results are the same as the ones that would have been
obtained using the built-in **likelihood(logit)** model (given the same
random-number seed and initial values).

We can choose a different built-in prior, for example, a normal prior with zero mean and variance of 25, again for both coefficients.

. bayesmh foreign mpg, llevaluator(logitll) prior({foreign:}, normal(0,25))Burn-in ... Simulation ... Model summary

Likelihood: | ||

foreign ~ logitll(xb_foreign) | ||

Prior | ||

{foreign:mpg _cons} ~ normal(0,25) (1) | ||

(1) Parameters are elements of the linear form xb_foreign. | ||

Bayesian regression MCMC iterations = 12,500 | ||

Random-walk Metropolis-Hastings sampling Burn-in = 2,500 | ||

MCMC sample size = 10,000 | ||

Number of obs = 74 | ||

Acceptance rate = .2125 | ||

Efficiency: min = .09213 | ||

avg = .0953 | ||

Log marginal likelihood = -47.146104 max = .09847 | ||

Equal-tailed | ||

foreign | Mean Std. Dev. MCSE Median [95% Cred. Interval] | |

mpg | .1584296 .0498191 .001641 .1549375 .0650025 .2631174 | |

_cons | -4.337211 1.160714 .03699 -4.319106 -6.783982 -2.203245 | |

What if you need to use a prior that is not supported? For simple prior
distributions, you can specify the expression for the prior density or log
density in the corresponding **prior()**'s suboptions **density()** and
**logdensity()**. For other prior distributions, you can incorporate them
directly into the computation of the posterior density; see Log-posterior evaluators.

In Log-likelihood evaluators,
we created the **logitll** program to
compute the log likelihood for a logistic model. Under the flat prior, a
prior with the density of 1, the log posterior equals the log likelihood. So,
assuming the flat prior, we can use our log-likelihood evaluator as the
log-posterior evaluator as well.

We specify the name of the log-posterior evaluator in option **evaluator()**.
We do not specify the **prior()** option, because the prior information is
already incorporated in the computation of the log-posterior density.

. bayesmh foreign mpg, evaluator(logitll)Burn-in ... Simulation ... Model summary

Posterior: | ||

foreign ~ logitll(xb_foreign) | ||

Bayesian regression MCMC iterations = 12,500 | ||

Random-walk Metropolis-Hastings sampling Burn-in = 2,500 | ||

MCMC sample size = 10,000 | ||

Number of obs = 74 | ||

Acceptance rate = .2714 | ||

Efficiency: min = .09526 | ||

avg = .0956 | ||

Log marginal likelihood = -41.616161 max = .09594 | ||

Equal-tailed | ||

foreign | Mean Std. Dev. MCSE Median [95% Cred. Interval] | |

mpg | .1706509 .0553838 .001794 .1689529 .0700874 .2867964 | |

_cons | -4.63653 1.274421 .041146 -4.564754 -7.247727 -2.325101 | |

As expected, we obtain results identical to the log-likelihood evaluator with the flat prior (given the same random-number seed).

As a demonstration, let's now write a log-posterior evaluator for the normal-prior model from Log-likelihood evaluators.

program logit_normal version 14.1 args lnp xb // compute log likelihood tempvar lnfj quietly generate double `lnfj' = ln(invlogit( `xb')) /// if $MH_y == 1 & $MH_touse quietly replace `lnfj' = ln(invlogit(-`xb')) if $MH_y == 0 & $MH_touse quietly summarize `lnfj', meanonly tempname lnf scalar `lnf' = r(sum) if r(N) < $MH_n { scalar `lnp' = . exit } // compute log posterior scalar `lnp' = `lnf' + lnnormalden($MH_b[1,1],0,5) + /// lnnormalden($MH_b[1,2],0,5) end

We write a new program called **logit_normal**, which is a straightforward
extension of the **logitll** program. We compute the log likelihood as we did
before, but we now return the overall log posterior **lnp** instead of the
overall log likelihood **lnf**. The log posterior **lnp** is computed as the sum
of the log likelihood and the log prior distributions of the two parameters.
Global macro **MH_b** contains the name of a temporary vector of coefficients;
**$MH_b[1,1]** contains the current value of the first parameter, **mpg**;
and **$MH_b[1,2]** contains the current value of the second parameter,
**_cons**.

We specify the name of the new evaluator in option **evaluator()**.

. bayesmh foreign mpg, evaluator(logit_normal)Burn-in ... Simulation ... Model summary

Posterior: | ||

foreign ~ logitll(xb_foreign) | ||

Bayesian regression MCMC iterations = 12,500 | ||

Random-walk Metropolis-Hastings sampling Burn-in = 2,500 | ||

MCMC sample size = 10,000 | ||

Number of obs = 74 | ||

Acceptance rate = .2125 | ||

Efficiency: min = .09213 | ||

avg = .0953 | ||

Log marginal likelihood = -47.146104 max = .09847 | ||

Equal-tailed | ||

foreign | Mean Std. Dev. MCSE Median [95% Cred. Interval] | |

mpg | .1584296 .0498191 .001641 .1549375 .0650025 .2631174 | |

_cons | -4.337211 1.160714 .03699 -4.319106 -6.783982 -2.203245 | |

The results are identical to the normal-prior model in Log-likelihood evaluators.

Read the overview from the *Stata News*.

View a complete list of Bayesian analysis features.

Upgrade now Order Stata