## Stata 15 help for bayesmh

```
[BAYES] bayesmh -- Bayesian regression using Metropolis-Hastings algorithm

Syntax

Univariate linear models

bayesmh depvar [indepvars] [if] [in] [weight], likelihood(modelspec)
prior(priorspec) [reffects(varname) options]

Multivariate linear models

Multivariate normal linear regression with common regressors

bayesmh depvars = [indepvars] [if] [in] [weight],
likelihood(mvnormal(...)) prior(priorspec) [options]

Multivariate normal regression with outcome-specific regressors

bayesmh ([eqname1:]depvar1 [indepvars1])
([eqname2:]depvar2 [indepvars2]) [...] [if] [in] [weight],
likelihood(mvnormal(...)) prior(priorspec) [options]

Multiple-equation linear models

bayesmh (eqspec) [(eqspec)] [...] [if] [in] [weight],
prior(priorspec) [options]

Nonlinear models

Univariate nonlinear regression

bayesmh depvar = (subexpr) [if] [in] [weight],
likelihood(modelspec) prior(priorspec) [options]

Multivariate normal nonlinear regression

bayesmh (depvar1 = (subexpr1))
(depvar2 = (subexpr2)) [...] [if] [in] [weight],
likelihood(mvnormal(...)) prior(priorspec) [options]

Probability distributions

Univariate distributions

bayesmh depvar [if] [in] [weight], likelihood(distribution)
prior(priorspec) [options]

Multiple-equation distribution specifications

bayesmh (deqspec) [(deqspec)] [...] [if] [in] [weight],
prior(priorspec) [options]

The syntax of eqspec is

varspec [if] [in] [weight], likelihood(modelspec) [noconstant]

The syntax of varspec is one of the following:

for single outcome

[eqname:]depvar [indepvars]

for multiple outcomes with common regressors

depvars = [indepvars]

for multiple outcomes with outcome-specific regressors

([eqname1:]depvar1 [indepvars1]) ([eqname2:]depvar2
[indepvars2]) [...]

The syntax of deqspec is

[eqname:] depvar [if] [in] [weight], likelihood(distribution)

subexpr, subexpr1, subexpr2, and so on are substitutable expressions; see
Substitutable expressions for details.

The syntax of modelspec is

model [, modelopts]

model                   Description
-------------------------------------------------------------------------
Model
normal(var)           normal regression with variance var
t(sigma2, df)         t regression with squared scale sigma2 and
degrees of freedom df
lognormal(var)        lognormal regression with variance var
lnormal(var)          synonym for lognormal()
exponential           exponential regression
mvnormal(Sigma)       multivariate normal regression with covariance
matrix Sigma

probit                probit regression
logit                 logistic regression
logistic              logistic regression; synonym for logit
binomial(n)           binomial regression with logit link and number of
trials n
binlogit(n)           synonym for binomial()
oprobit               ordered probit regression
ologit                ordered logistic regression
poisson               Poisson regression

llf(subexpr)          substitutable expression for observation-level
log-likelihood function
-------------------------------------------------------------------------
A distribution argument is a number for scalar arguments such as var; a
variable name, varname (except for matrix arguments); a matrix for
matrix arguments such as Sigma; a model parameter, paramspec; an
expression, expr; or a substitutable expression, subexpr.  See
Specifying arguments of likelihood models and prior distributions.

modelopts               Description
-------------------------------------------------------------------------
Model
offset(varname_o)     include varname_o in model with coefficient
constrained to 1; not allowed with normal() and
mvnormal()
exposure(varname_e)   include ln(varname_e) in model with coefficient
constrained to 1; allowed only with poisson
-------------------------------------------------------------------------

distribution            Description
-------------------------------------------------------------------------
Model
dexponential(beta)    exponential distribution with scale parameter
beta
dbernoulli(p)         Bernoulli distribution with success probability p
dbinomial(p,n)        binomial distribution with success probability p
and number of trials n
dpoisson(mu)          Poisson distribution with mean mu
-------------------------------------------------------------------------
A distribution argument is a model parameter, paramspec, or a
substitutable expression, subexpr, containing model parameters.  An n
argument may be a number; an expression, expr; or a variable name,
varname.  See Specifying arguments of likelihood models and prior
distributions.

The syntax of priorspec is

paramref, priordist

where the simplest specification of paramref is

paramspec [paramspec [...]]

Also see Referring to model parameters for other specifications.

The syntax of paramspec is

{[eqname:]param[, matrix]}

where the parameter label eqname and parameter name param are valid Stata
names.  Model parameters are either scalars such as {var}, {mean},
{scale:beta}, or matrices such as {Sigma, matrix} and {Scale:V,
matrix}.  For scalar parameters, you can use {param=#} to specify an
initial value.  For example, you can specify, {var=1}, {mean=1.267},
or {shape:alpha=3}.

priordist                     Description
-------------------------------------------------------------------------
Model
normal(mu,var)              normal with mean mu and variance var
t(mu,sigma2,df)             location-scale t with mean mu, squared
scale sigma2, and degrees of freedom df
lognormal(mu,var)           lognormal with mean mu and variance var
lnormal(mu,var)             synonym for lognormal()
uniform(a,b)                uniform on (a,b)
gamma(alpha,beta)           gamma with shape alpha and scale beta
igamma(alpha,beta)          inverse gamma with shape alpha and scale
beta
exponential(beta)           exponential with scale beta
laplace(mu,beta)            Laplace with mean mu and scale beta
cauchy(loc,beta)            Cauchy with location loc and scale beta
beta(a,b)                   beta with shape parameters a and b
chi2(df)                    central chi-squared with degrees of freedom
df
jeffreys                    Jeffreys prior for variance of a normal
distribution

mvnormal(d,mean,Sigma)      multivariate normal of dimension d with
mean vector mean and covariance matrix
Sigma; mean can be a matrix name or a
list of d means separated by comma: mu1,
mu2, ..., mud
mvnormal0(d,Sigma)          multivariate normal of dimension d with
zero mean vector and covariance matrix
Sigma
mvn0(d,Sigma)               synonym for mvnormal0()
zellnersg(d,g,mean,{var})   Zellner's g-prior of dimension d with g
degrees of freedom, mean vector mean, and
variance parameter {var}; mean can be a
matrix name or a list of d means
separated by comma: mu1, mu2, ..., mud
zellnersg0(d,g,{var})       Zellner's g-prior of dimension d with g
degrees of freedom, zero mean vector, and
variance parameter {var}
wishart(d,df,V)             Wishart of dimension d with degrees of
freedom df and scale matrix V
iwishart(d,df,V)            inverse Wishart of dimension d with degrees
of freedom df and scale matrix V
jeffreys(d)                 Jeffreys prior for covariance of a
multivariate normal distribution of
dimension d

bernoulli(p)                Bernoulli with success probability p
index(p1,...,pk)            discrete indices 1, 2, ..., k with
probabilities p1, p2, ..., pk
poisson(mu)                 Poisson with mean mu

flat                        flat prior; equivalent to density(1) or
logdensity(0)
density(f)                  generic density f
logdensity(logf)            generic logdensity logf
-------------------------------------------------------------------------
Dimension d is a positive #.
A distribution argument is a number for scalar arguments such as var,
alpha, beta; a Stata matrix for matrix arguments such as Sigma and V;
a model parameter, paramspec; an expression, expr; or a substitutable
expression, subexpr.  See Specifying arguments of likelihood models
and prior distributions.
f is a nonnegative number, #; an expression, expr; or a substitutable
expression, subexpr.
logf is a number, #; an expression, expr; or a substitutable expression,
subexpr.
When mvnormal() or mvnormal0() of dimension d is applied to paramref with
n parameters (n!=d), paramref is reshaped into a matrix with d
columns, and its rows are treated as independent samples from the
specified mvnormal() distribution. If such reshaping is not possible,
an error is issued.  See example 25 for application of this feature.

options                         Description
-------------------------------------------------------------------------
Model
noconstant                    suppress constant term; not allowed with
ordered models, nonlinear models, and
probability distributions
* likelihood(lspec)             distribution for the likelihood model
* prior(priorspec)              prior for model parameters; this option
may be repeated
dryrun                        show model summary without estimation

Model 2
redefine(label:i.varname)     specify a random-effects linear form;
this option may be repeated
xbdefine(label:varlist)       specify a linear form

Simulation

mcmcsize(#)                   MCMC sample size; default is
mcmcsize(10000)
burnin(#)                     burn-in period; default is burnin(2500)
thinning(#)                   thinning interval; default is thinning(1)
rseed(#)                      random-number seed
exclude(paramref)             specify model parameters to be excluded
from the simulation results

Blocking

block(paramref[, blockopts])  specify a block of model parameters; this
option may be repeated
blocksummary                  display block summary

Initialization

initial(initspec)             initial values for model parameters
nomleinitial                  suppress the use of maximum likelihood
estimates as starting values
initrandom                    specify random initial values
initsummary                   display initial values used for
simulation

Adaptation

adaptation(adaptopts)         control the adaptive MCMC procedure
scale(#)                      initial multiplier for scale factor;
default is scale(2.38)
covariance(cov)               initial proposal covariance; default is
the identity matrix

Reporting

clevel(#)                     set credible interval level; default is
clevel(95)
hpd                           display HPD credible intervals instead of
the default equal-tailed credible
intervals
eform[(string)]               report exponentiated coefficients and,
optionally, label as string
batch(#)                      specify length of block for batch-means
calculations; default is batch(0)
saving(filename, replace)     save simulation results to filename.dta
nomodelsummary                suppress model summary
noexpression                  suppress output of expressions from model
summary
[no]dots                      suppress dots or display dots every 100
iterations and iteration numbers every
1,000 iterations; default is nodots
dots(#[, every(#)])           display dots as simulation is performed
[no]show(paramref)            specify model parameters to be excluded
from or included in the output
showreffects[(reref)]         specify that all or a subset of
random-effects parameters be included
in the output
notable                       suppress estimation table
title(string)                 display string as title above the table
of parameter estimates
display_options               control spacing, line width, and base and
empty cells

search(search_options)        control the search for feasible initial
values
corrlag(#)                    specify maximum autocorrelation lag;
default varies
corrtol(#)                    specify autocorrelation tolerance;
default is corrtol(0.01)
-------------------------------------------------------------------------
* Options likelihood() and prior() are required.  prior() must be
specified for all model parameters.
Options prior(), redefine(), and block() can be repeated.
indepvars and paramref may contain factor variables; see fvvarlist.
With multiple-equations specifications, a local if specified within an
equation is applied together with the global if specified with the
command.
Only fweights are allowed; see weight.
With multiple-equations specifications, local weights (weights
specified within an equation) override global weights (weights
specified with the command).
See [BAYES] bayesian postestimation for features available after
estimation.

blockopts                      Description
-------------------------------------------------------------------------
gibbs                           requests Gibbs sampling; available for
selected models only and not allowed
with scale(), covariance(), or
adaptation()
split                           requests that all parameters in a block
be treated as separate blocks
reffects                        requests that all parameters in a block
be treated as random-effects parameters
scale(#)                        initial multiplier for scale factor for
current block; default is scale(2.38);
not allowed with gibbs
covariance(cov)                 initial proposal covariance for the
current block; default is the identity
matrix; not allowed with gibbs
adaptation(tarate() tolerance()) control the adaptation MCMC procedure of
the current block; not allowed with
gibbs
-------------------------------------------------------------------------
Only tarate() and tolerance() may be specified in the adaptation()
option.

adaptopts                       Description
-------------------------------------------------------------------------
every(#)                      adaptation interval; default is
every(100)
maxiter(#)                    maximum number of adaptation loops;
default is maxiter(25) or
max{25,floor(burnin()/every())}
whenever default values of these
options are modified
miniter(#)                    minimum number of adaptation loops;
default is miniter(5)
alpha(#)                      parameter controlling acceptance rate
(AR); default is alpha(0.75)
beta(#)                       parameter controlling proposal
covariance; default is beta(0.8)
gamma(#)                      parameter controlling adaptation rate;
default is gamma(0)
* tarate(#)                     target acceptance rate (TAR); default is
parameter specific
* tolerance(#)                  tolerance for AR; default is
tolerance(0.01)
-------------------------------------------------------------------------
* Only starred options may be specified in the adaptation() option
specified within block().

Menu

Statistics > Bayesian analysis > General estimation and regression

Description

bayesmh fits a variety of Bayesian models using an adaptive
Metropolis-Hastings (MH) algorithm.  It provides various likelihood
models and prior distributions for you to choose from.  Likelihood models
include univariate normal linear and nonlinear regressions, multivariate
normal linear and nonlinear regressions, generalized linear models such
as logit and Poisson regressions, and multiple-equations linear models.
Prior distributions include continuous distributions such as uniform,
Jeffreys, normal, gamma, multivariate normal, and Wishart and discrete
distributions such as Bernoulli and Poisson.  You can also program your
own Bayesian models; see [BAYES] bayesmh evaluators.

Also see [BAYES] bayesian estimation for a list of Bayesian regression
models that can be fit more conveniently with the bayes prefix ([BAYES]
bayes).

Options

+-------+
----+ Model +------------------------------------------------------------

noconstant suppresses the constant term (intercept) from the regression
model. By default, bayesmh automatically includes a model parameter
{depname:_cons} in all regression models except ordered and nonlinear
models. Excluding the constant term may be desirable when there is a
factor variable, the base level of which absorbs the constant term in
the linear combination.

likelihood(lspec) specifies the distribution of the data.  This option
specifies the likelihood portion of the Bayesian model. This option
is required.  lspec is one of modelspec or distribution.

modelspec specifies one of the supported likelihood distributions for
regression models.  A location parameter of these distributions is
automatically parameterized as a linear combination of the specified
independent variables and need not be specified.  Other parameters
may be specified as arguments to the distribution separated by
commas. Each argument may be a real number (#), a variable name
(except for matrix parameters), a predefined matrix, a model
parameter specified in { }, a Stata expression, or a substitutable
expression containing model parameters; see Declaring model
parameters and Specifying arguments of likelihood models and prior
distributions in [BAYES] bayesmh.

distribution specifies one of the supported distributions for
modeling the dependent variable.  A distribution argument must be a
model parameter specified in { } or a substitutable expression
containing model parameters; see Declaring model parameters and
Specifying arguments of likelihood models and prior distributions in
[BAYES] bayesmh.  A number of trials, n, of the binomial distribution
may be a real number (#), a Stata expression, or a variable name.
For an example of modeling outcome distributions directly, see
Beta-binomial model in [BAYES] bayesmh.

For some regression models, option likelihood() provides suboptions
subopts in likelihood(..., subopts).  subopts are offset() and
exposure().

offset(varname_o) specifies that varname_o be included in the
regression model with the coefficient constrained to be 1.  This
option is available with probit, logit, binomial(), binlogit(),
oprobit, ologit, and poisson.

exposure(varname_e) specifies a variable that reflects the amount of
exposure over which the depvar events were observed for each
observation; ln(varname_e) with coefficient constrained to be 1
is entered into the log-link function. This option is available
with poisson.

prior(priorspec) specifies a prior distribution for model parameters.
This option is required and may be repeated.  A prior must be
specified for each model parameter.  Model parameters may be scalars
or matrices, but both types may not be combined in one prior
statement.  If multiple scalar parameters are assigned a single
univariate prior, they are considered independent, and the specified
prior is used for each parameter.  You may assign a multivariate
prior of dimension d to d scalar parameters.  Also see Referring to
model parameters below and Specifying arguments of likelihood models
and prior distributions in [BAYES] bayesmh.

All likelihood() and prior() combinations are allowed, but they are not
guaranteed to correspond to proper posterior distributions.  You need to
think carefully about the model you are building and evaluate its
convergence thoroughly.

dryrun specifies to show the summary of the model that would be fit
without actually fitting the model.  This option is recommended for
checking specifications of the model before fitting the model.  The
model summary reports the information about the likelihood model and
about priors for all model parameters.
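
As a minimal sketch of checking a specification with dryrun, assuming
hypothetical variables y and x (they do not come from this help file) and
purely illustrative prior settings, you could type

bayesmh y x, likelihood(normal({var})) prior({y:}, normal(0,100)) prior({var}, igamma(0.1,0.1)) dryrun

Here {y:} refers to all regression coefficients of equation y, and {var}
is the variance parameter of the normal likelihood.  Every model
parameter receives a prior, as required.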

+---------+
----+ Model 2 +----------------------------------------------------------

reffects(varname) specifies a random-effects variable, a variable
identifying the group structure for the random effects, with
univariate linear models.  This option is useful for fitting
two-level random-intercept models.  A random-effects variable is
treated as a factor variable with no base level.  As such, you can
refer to random-effects parameters or, simply, random effects
associated with varname using a conventional factor-variable
notation.  For example, you can use {depvar:i.varname} to refer to
all random-effects parameters of varname.  These parameters must be
included in a single prior statement, usually a normal distribution
with variance specified by an additional parameter.  The
random-effects parameters are assumed to be conditionally independent
across levels of varname given all other model parameters.  The
random-effects parameters are automatically grouped in one block and
are thus not allowed in the block() option.  See example 23.

redefine(label:i.varname) specifies a random-effects linear form that can
be used in substitutable expressions.  You can use {label:} to refer
to the linear form in substitutable expressions.  You can specify
{label:i.varname} to refer to all random-effects parameters
associated with varname.  The random-effects parameters are
automatically grouped in one block and are thus not allowed in the
block() option.  This option is useful for fitting multilevel models
and can be repeated.  See example 29.

xbdefine(label:varlist) specifies a linear form of the variables in
varlist that can be used in substitutable expressions.  You can use
the specification {label:} to refer to the linear form in
substitutable expressions.  For any varname in varlist, you can use
{label:varname} to refer to the corresponding parameter. This option
is useful with nonlinear specifications when the linear form contains
many variables and provides more efficient computation in such cases.

+------------+
----+ Simulation +-------------------------------------------------------

mcmcsize(#) specifies the target MCMC sample size.  The default MCMC
sample size is mcmcsize(10000).  The total number of iterations for
the MH algorithm equals the sum of the burn-in iterations and the
MCMC sample size in the absence of thinning.  If thinning is present,
the total number of MCMC iterations is computed as burnin() +
(mcmcsize() - 1) x thinning() + 1.  Computation time of the MH
algorithm is proportional to the total number of iterations.  The
MCMC sample size determines the precision of posterior summaries,
which may be different for different model parameters and will depend
on the efficiency of the Markov chain. Also see Burn-in period and
MCMC sample size in [BAYES] bayesmh.
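
As a worked instance of the formula above, with the illustrative
settings burnin(2500), mcmcsize(10000), and thinning(5), the total number
of MCMC iterations is

2500 + (10000 - 1) x 5 + 1 = 52496

With the default thinning(1), the same formula gives 2500 + 9999 + 1 =
12500, which is the sum of the burn-in iterations and the MCMC sample
size.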

burnin(#) specifies the number of iterations for the burn-in period of
MCMC.  The values of parameters simulated during burn-in are used for
adaptation purposes only and are not used for estimation.  The
default is burnin(2500).  Typically, burn-in is chosen to be as long
as or longer than the adaptation period. Also see Burn-in period and
MCMC sample size and Convergence of MCMC in [BAYES] bayesmh.

thinning(#) specifies the thinning interval.  Only simulated values from
every (1+k x #)th iteration for k = 0, 1, 2, ... are saved in the
final MCMC sample; all other simulated values are discarded. The
default is thinning(1); that is, all simulation values are saved.
Thinning greater than one is typically used for decreasing the
autocorrelation of the simulated MCMC sample.

rseed(#) sets the random-number seed.  This option can be used to
reproduce results.  rseed(#) is equivalent to typing set seed # prior
to calling bayesmh; see [R] set seed and Reproducing results in
[BAYES] bayesmh.

exclude(paramref) specifies which model parameters should be excluded
from the final MCMC sample.  These model parameters will not appear
in the estimation table, and postestimation features for these
parameters and log marginal likelihood will not be available.  This
option is useful for suppressing nuisance model parameters.  For
example, if you have a factor predictor variable with many levels but
you are only interested in the variability of the coefficients
associated with its levels, not their actual values, then you may
wish to exclude this factor variable from the simulation results.  If
you simply want to omit some model parameters from the output, see
the noshow() option.  paramref can include individual random-effects
parameters.

+----------+
----+ Blocking +---------------------------------------------------------

block(paramref[, blockopts]) specifies a group of model parameters for
the blocked MH algorithm.  By default, all parameters except matrices
are treated as one block, and each matrix parameter is viewed as a
separate block.  You can use the block() option to separate scalar
parameters in multiple blocks.  Technically, you can also use block()
to combine matrix parameters in one block, but this is not
recommended.  The block() option may be repeated to define multiple
blocks.  Different types of model parameters, such as scalars and
matrices, may not be specified in one block().  Parameters within one
block are updated simultaneously, and each block of parameters is
updated in the order it is specified; the first specified block is
updated first, the second is updated second, and so on. See Improving
efficiency of the MH algorithm---blocking of parameters in [BAYES]
bayesmh.

blockopts include gibbs, split, reffects, scale(), covariance(), and
adaptation().

gibbs specifies to use Gibbs sampling to update parameters in the
block.  This option is allowed only for specific combinations of
likelihood models and prior distributions; see Gibbs sampling for
some likelihood-prior and prior-hyperprior configurations in
[BAYES] bayesmh.  gibbs may not be combined with scale(),
covariance(), or adaptation().

split specifies that all parameters in a block are treated as
separate blocks.  This may be useful for levels of factor
variables.

reffects specifies that the parameters associated with the levels of
a factor variable included in the likelihood specification be
treated as random-effects parameters.  Random-effects parameters
must be included in one prior statement and are assumed to be
conditionally independent across levels of a grouping variable
given all other model parameters.  reffects requires that
parameters be specified as {depvar:i.varname}, where i.varname is
the corresponding factor variable in the likelihood
specification, and may not be combined with block()'s suboptions
gibbs and split.  This option is useful for fitting hierarchical
or multilevel models.  See example 25 in [BAYES] bayesmh for
details.

scale(#) specifies an initial multiplier for the scale factor
corresponding to the specified block.  The initial scale factor
is computed as #/sqrt{n_p} for continuous parameters and as #/n_p
for discrete parameters, where n_p is the number of parameters in
the block. The default is scale(2.38).  If specified, this option
overrides the respective setting from the scale() option
specified with the command.  scale() may not be combined with
gibbs.

covariance(matname) specifies a scale matrix matname to be used to
compute an initial proposal covariance matrix corresponding to
the specified block.  The initial proposal covariance is computed
as rho x Sigma, where rho is a scale factor and Sigma = matname.
By default, Sigma is the identity matrix.  If specified, this
option overrides the respective setting from the covariance()
option specified with the command.  covariance() may not be
combined with gibbs.

adaptation(tarate() tolerance()) specifies block-specific TAR and
acceptance tolerance. If specified, they
override the respective settings from the adaptation() option
specified with the command.  adaptation() may not be combined
with gibbs.

blocksummary displays the summary of the specified blocks.  This option
is useful when block() is specified.
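
As a minimal sketch of blocking, assuming hypothetical variables y and x,
illustrative priors, and that Gibbs sampling is available for this
likelihood-prior combination, the following places the variance in its
own block updated by Gibbs sampling and displays the block summary:

bayesmh y x, likelihood(normal({var})) prior({y:}, normal(0,100)) prior({var}, igamma(0.1,0.1)) block({var}, gibbs) blocksummary

The regression coefficients {y:} are not listed in any block() option,
so they stay in the default block and are updated by MH.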

+----------------+
----+ Initialization +---------------------------------------------------

initial(initspec) specifies initial values for the model parameters to be
used in the simulation.  You can specify a parameter name, its
initial value, another parameter name, its initial value, and so on.
For example, to initialize a scalar parameter alpha to 0.5 and a 2x2
matrix Sigma to the identity matrix I(2), you can type

bayesmh ..., initial({alpha} 0.5 {Sigma,m} I(2)) ...

You can also specify a list of parameters using any of the
specifications described in Referring to model parameters in [BAYES]
bayesmh.  For example, to initialize all regression coefficients from
equations y1 and y2 to zero, you can type

bayesmh ..., initial({y1:} {y2:} 0) ...

The general specification of initspec is

paramref # [paramref # [...]]

Curly braces may be omitted for scalar parameters but must be
specified for matrix parameters.  Initial values declared using this
option override the default initial values or any initial values
declared during parameter specification in the likelihood() option.
See Specifying initial values in [BAYES] bayesmh for details.

nomleinitial suppresses the use of maximum likelihood estimates (MLEs)
as starting values for model parameters.  By default, when no initial
values are specified, MLE values (when available) are used as initial
values.  If nomleinitial is specified and no initial values are
provided, the command uses ones for positive scalar parameters, zeros
for other scalar parameters, and identity matrices for matrix
parameters.  nomleinitial may be useful for providing an alternative
starting state when checking convergence of MCMC.  This option cannot
be combined with initrandom.

initrandom requests that the model parameters be initialized randomly.
Random initial values are generated from the prior distributions of
the model parameters.  If you want to use fixed initial values for
some of the parameters, you can specify them in the initial() option
or during parameter declarations in the likelihood() option.  Random
initial values are not available for parameters with flat, density(),
logdensity(), and jeffreys() priors; you must provide fixed initial
values for such parameters.  This option cannot be combined with
nomleinitial.

initsummary specifies that the initial values used for simulation be
displayed.

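+------------+
----+ Adaptation +-------------------------------------------------------

adaptation(adaptopts) controls adaptation of the MCMC procedure.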

Adaptation takes place every prespecified number of MCMC iterations
and consists of tuning the proposal scale factor and proposal
covariance for each block of model parameters.  Adaptation is used to
improve sampling efficiency.  Provided defaults are based on
theoretical results and may not be sufficient for all applications.
See Adaptation of the MH algorithm in [BAYES] bayesmh for details.

adaptopts are any of the following options:

every(#) specifies that adaptation be attempted every #th iteration.
The default is every(100).  To determine the adaptation interval,
you need to consider the maximum block size specified in your
model.  The update of a block with k model parameters requires
the estimation of a k x k covariance matrix.  If the adaptation
interval is not sufficient for estimating the k(k+1)/2 elements
of this matrix, the adaptation may be insufficient.

maxiter(#) specifies the maximum number of adaptive iterations.
Adaptation includes tuning of the proposal covariance and of the
scale factor for each block of model parameters.  Once the TAR is
achieved within the specified tolerance, the adaptation stops.
However, no more than # adaptation steps will be performed.  The
default is variable and is computed as
max{25,floor(burnin()/every())}.  maxiter() is usually chosen to be
no greater than (mcmcsize() + burnin())/every().

miniter(#) specifies the minimum number of adaptive iterations to be
performed regardless of whether the TAR has been achieved.  The
default is miniter(5).  If the specified miniter() is greater
than maxiter(), then miniter() is reset to maxiter().  Thus, if
you specify maxiter(0), then no adaptation will be performed.

alpha(#) specifies a parameter controlling the adaptation of the AR.
alpha() should be in [0,1].  The default is alpha(0.75).

beta(#) specifies a parameter controlling the adaptation of the
proposal covariance matrix.  beta() must be in [0,1].  The closer
beta() is to zero, the less adaptive the proposal covariance.
When beta() is zero, the same proposal covariance will be used in
all MCMC iterations.  The default is beta(0.8).

gamma(#) specifies a parameter controlling the adaptation rate of the
proposal covariance matrix.  gamma() must be in [0,1].  The
larger the value of gamma(), the less adaptive the proposal
covariance.  The default is gamma(0).

tarate(#) specifies the TAR for all blocks of model parameters; this
is rarely used.  tarate() must be in (0,1).  The default TAR is
0.234 for blocks containing multiple continuous parameters, 0.44
for blocks with one continuous parameter, and 1/n_maxlev for
blocks with discrete parameters, where n_maxlev is the maximum
number of levels for a discrete parameter in the block.

tolerance(#) specifies the tolerance criterion for adaptation based
on the TAR.  tolerance() should be in (0,1).  Adaptation stops
whenever the absolute difference between the current AR and TAR
is less than tolerance().  The default is tolerance(0.01).

scale(#) specifies an initial multiplier for the scale factor for all
blocks.  The initial scale factor is computed as #/sqrt{n_p} for
continuous parameters and #/n_p for discrete parameters, where n_p is
the number of parameters in the block.  The default is scale(2.38).
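
For example, as a quick arithmetic check of these formulas (the block
size n_p = 4 is an assumed value chosen only for illustration), the
default scale(2.38) gives an initial scale factor of 2.38/sqrt(4) =
1.19 for a block of four continuous parameters and 2.38/4 = 0.595 for
a block of four discrete parameters:

. display 2.38/sqrt(4)
. display 2.38/4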

covariance(cov) specifies a scale matrix cov to be used to compute an
initial proposal covariance matrix.  The initial proposal covariance
is computed as rho x Sigma, where rho is a scale factor and Sigma =
cov.  By default, Sigma is the identity matrix.  Partial
specification of Sigma is also allowed.  The rows and columns of cov
should be named after some or all model parameters.  According to
some theoretical results, the optimal proposal covariance is the
posterior covariance matrix of model parameters, which is usually
unknown.  This option does not apply to the blocks containing
random-effects parameters.

+-----------+
----+ Reporting +--------------------------------------------------------

clevel(#) specifies the credible level, as a percentage, for equal-tailed
and HPD credible intervals.  The default is clevel(95) or as set by
[BAYES] set clevel.

hpd specifies the display of HPD credible intervals instead of the
default equal-tailed credible intervals.

eform and eform(string) specify that the coefficient table be displayed
in exponentiated form and that exp(b) and string, respectively, be
used to label the exponentiated coefficients in the table.

batch(#) specifies the length of the block for calculating batch means,
batch standard deviation, and MCSE using batch means.  The default is
batch(0), which means no batch calculations.  When batch() is not
specified, MCSE is computed using effective sample sizes instead of
batch means.  Option batch() may not be combined with corrlag() or
corrtol().

saving(filename[, replace]) saves simulation results in filename.dta.
The replace option specifies to overwrite filename.dta if it exists.
If the saving() option is not specified, bayesmh saves simulation
results in a temporary file for later access by postestimation
commands.  This temporary file will be overwritten every time bayesmh
is run and will also be erased if the current estimation results are
cleared. saving() may be specified during estimation or on replay.

The saved dataset has the following structure. Variable _index
records iteration numbers.  bayesmh saves only states (sets of
parameter values) that are different from one iteration to another
and the frequency of each state in variable _frequency. (Some states
may be repeated for discrete parameters.) As such, _index may not
necessarily contain consecutive integers. Remember to use _frequency
as a frequency weight if you need to obtain any summaries of this
dataset.  Values for each parameter are saved in a separate variable
in the dataset.  Variables containing values of parameters without
equation names are named as eq0_p#, following the order in which
parameters are declared in bayesmh.  Variables containing values of
parameters with equation names are named as eq#_p#, again following
the order in which parameters are defined.  Parameters with the same
equation names will have the same variable prefix eq#.  For example,

. bayesmh y x1, likelihood(normal({var})) saving(mcmc) ...

will create a dataset, mcmc.dta, with variable names eq1_p1 for
{y:x1}, eq1_p2 for {y:_cons}, and eq0_p1 for {var}.  Also see macros
e(parnames) and e(postvars) for the correspondence between parameter
names and variable names.

In addition, bayesmh saves variable _loglikelihood to contain values
of the log likelihood from each iteration and variable _logposterior
to contain values of the log posterior from each iteration.
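
As a minimal sketch (continuing the hypothetical example above, so
the file name mcmc and the variable names eq1_p1, eq1_p2, and eq0_p1
are assumptions carried over from it), the saved draws could be
summarized with _frequency as a frequency weight.  The two display
commands show the parameter names and the matching variable names
before the simulation dataset replaces the data in memory:

. display "`e(parnames)'"
. display "`e(postvars)'"
. use mcmc, clear
. summarize eq1_p1 eq1_p2 eq0_p1 [fweight=_frequency]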

nomodelsummary suppresses the detailed summary of the specified model.
The model summary is reported by default.

noexpression suppresses the output of expressions from the model summary.
Expressions (when specified) are reported by default.

nodots, dots, and dots(#) specify to suppress or display dots during
simulation.  dots(#) displays a dot every # iterations.  During the
adaptation period, a symbol a is displayed instead of a dot. If
dots(..., every(#)) is specified, then an iteration number is
displayed every #th iteration instead of a dot or a. dots(, every(#))
is equivalent to dots(1, every(#)).  dots displays dots every 100
iterations and iteration numbers every 1,000 iterations; it is a
synonym for dots(100, every(1000)).  By default, no dots are displayed
(nodots or dots(0)).

show(paramref) or noshow(paramref) specifies a list of model parameters
to be included in the output or excluded from the output,
respectively.  By default, all model parameters (except
random-effects parameters when reffects() is specified) are
displayed.  Do not confuse noshow() with exclude(), which excludes
the specified parameters from the MCMC sample.  When the noshow()
option is specified, for computational efficiency, MCMC summaries of
the specified parameters are not computed or stored in e().  paramref
can include individual random-effects parameters.

showreffects and showreffects(reref) are used with option reffects() and
specify that all or a list reref of random-effects parameters be
included in the output in addition to other model parameters.  By
default, all random-effects parameters introduced by reffects() are
excluded from the output as if you have specified the noshow()
option.  This option computes, displays, and stores in e() MCMC
summaries for the first #_matsize-#_npar random-effects parameters,
where #_matsize is the maximum number of variables as determined by
matsize (see [R] matsize) and #_npar is the number of other model
parameters displayed.  If you want to obtain MCMC summaries and
display other random-effects parameters, you can use the show()
option or use bayesstats summary (see [BAYES] bayesstats summary).

notable suppresses the estimation table from the output.  By default, a
summary table is displayed containing all model parameters except
those listed in the exclude() and noshow() options.  Regression model
parameters are grouped by equation names.  The table includes six
columns and reports the following statistics using the MCMC
simulation results:  posterior mean, posterior standard deviation,
MCMC standard error or MCSE, posterior median, and credible
intervals.

noheader suppresses the output header either at estimation or upon
replay.

title(string) specifies an optional title for the command that is
displayed above the table of the parameter estimates.  The default
title is specific to the specified likelihood model.

display_options:  vsquish, noemptycells, baselevels, allbaselevels,
nofvlabel, fvwrap(#), fvwrapon(style), and nolstretch; see [R]
estimation options.

+----------+
----+ Advanced +---------------------------------------------------------

search(search_options) searches for feasible initial values.
search_options are on, repeat(#), and off.

search(on) is equivalent to search(repeat(500)).  This is the
default.

search(repeat(k)), k>0, specifies the number of random attempts to be
made to find a feasible initial-value vector, or initial state.
The default is repeat(500).  An initial-value vector is feasible
if it corresponds to a state with positive posterior probability.
If feasible initial values are not found after k attempts, an
error will be issued.  repeat(0) (rarely used) specifies that no
random attempts be made to find a feasible starting point.  In
this case, if the specified initial vector does not correspond to
a feasible state, an error will be issued.

search(off) prevents the command from searching for feasible initial
values.  We do not recommend specifying this option.

corrlag(#) specifies the maximum autocorrelation lag used for calculating
effective sample sizes.  The default is min{500,mcmcsize()/2}.  The
total autocorrelation is computed as the sum of all lag-k
autocorrelation values for k from 0 to either corrlag() or the index
at which the autocorrelation becomes less than corrtol() if the
latter is less than corrlag().  Options corrlag() and batch() may not
be combined.

corrtol(#) specifies the autocorrelation tolerance used for calculating
effective sample sizes.  The default is corrtol(0.01). For a given
model parameter, if the absolute value of the lag-k autocorrelation
is less than corrtol(), then all autocorrelation lags beyond the kth
lag are discarded.  Options corrtol() and batch() may not be
combined.
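
For example, a sketch of how these settings might be changed for the
oxygen model from the Examples section below (the values 100 and 0.05
are arbitrary illustrations, not recommendations), followed by a look
at the resulting effective sample sizes with bayesstats ess (see
[BAYES] bayesstats ess):

. webuse oxygen
. set seed 14
. bayesmh change age group, likelihood(normal({var}))
prior({change:}, flat) prior({var}, jeffreys) corrlag(100)
corrtol(0.05)
. bayesstats ess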

Remarks

Remarks are presented under the following headings:

Using bayesmh
Declaring model parameters
Referring to model parameters
Substitutable expressions

Using bayesmh

The bayesmh command for Bayesian analysis includes three functional
components: setting up a posterior model, performing MCMC simulation, and
summarizing and reporting results.  The first component, the
model-building step, requires some experience in the practice of Bayesian
statistics and, as with any modeling task, is probably the most demanding.
You should specify a posterior model that is statistically correct and
that represents the observed data.  Another important aspect is the
computational feasibility of the model in the context of the MH MCMC
procedure implemented in bayesmh.  The provided MH algorithm is adaptive
and, to a degree, can accommodate various statistical models and data
structures.  However, careful model parametrization and well-specified
initial values and MCMC sampling scheme are crucial for achieving a
fast-converging Markov chain and consequently good results.  Simulation
of MCMC must be followed by a thorough investigation of the convergence
of the MCMC algorithm.  Once you are satisfied with the convergence of
the simulated chains, you may proceed with posterior summaries of the
results and their interpretation.  Below we discuss the three major steps
of using bayesmh and provide recommendations.

Declaring model parameters

Model parameters are typically declared, meaning first introduced, in the
arguments of distributions specified in options likelihood() and prior().
We will refer to model parameters that are declared in the prior
distributions (and not the likelihood distributions) as hyperparameters.
Model parameters may also be declared within the parameter specification
of the prior() option, but this is less common.

bayesmh distinguishes between two types of model parameters: scalar and
matrix.  All parameters must be specified in curly braces, { }.  There
are two ways to declare a scalar parameter:  {param} and {eqname:param},
where param and eqname are valid Stata names.

The specification of a matrix parameter is similar, but you must use the
matrix suboption:  {param, matrix} and {eqname:param, matrix}.  The most
common application of matrix model parameters is for specifying the
variance-covariance matrix of a multivariate normal distribution.

All matrices are assumed to be symmetric and only the elements in the
lower triangle are reported in the output.  Only a few multivariate prior
distributions are available for matrix parameters: wishart(), iwishart(),
and jeffreys().  In addition to being symmetric, these distributions
require that the matrices be positive definite.

It is your responsibility to declare all parameters of your model, except
regression coefficients in linear models.  For a linear model, bayesmh
automatically creates a regression coefficient with the name
{depvar:indepvar} for each independent variable indepvar in the model
and, if noconstant is not specified, an intercept parameter
{depvar:_cons}.  In
the presence of factor variables, bayesmh will create a parameter
{depvar:level} for each level indicator level and a parameter
{depvar:inter} for each interaction indicator inter; see fvvarlists. (It
is still your responsibility, however, to specify prior distributions for
the regression parameters.)

For example,

. bayesmh y x, ...

will automatically have two regression parameters: {y:x} and {y:_cons},
whereas

. bayesmh y x, noconstant ...

will have only one: {y:x}.

For a univariate normal linear regression, we may want to additionally
declare the scalar variance parameter by

. bayesmh y x, likelihood(normal({sig2})) ...

We can label the variance parameter, as follows:

. bayesmh y x, likelihood(normal({var:sig2})) ...

We can declare a hyperparameter for {sig2} using

. bayesmh y x, likelihood(normal({sig2})) prior({sig2},
igamma({df},2)) ...

where the hyperparameter {df} is declared in the inverse-gamma prior
distribution for {sig2}.

For a multivariate normal linear regression, in addition to the four
regression parameters declared automatically by bayesmh, {y1:x},
{y1:_cons}, {y2:x}, and {y2:_cons}, we may also declare a parameter for
the variance-covariance matrix:

. bayesmh y1 y2 = x, likelihood(mvnormal({Sigma, matrix})) ...

or abbreviate matrix to m for short:

. bayesmh y1 y2 = x, likelihood(mvnormal({Sigma, m})) ...

Referring to model parameters

After a model parameter is declared, we may need to refer to it in our
further model specification.  We will definitely need to refer to it when
we specify its prior distribution.  We may also need to use it as an
argument in the prior distributions of other parameters or need to
specify it in the block() option for blocking of model parameters; see
Improving efficiency of the MH algorithm---blocking of parameters in
[BAYES] bayesmh.

To refer to one parameter, we simply use its definition:  {param},
{eqname:param}, {param, matrix}, or {eqname:param, matrix}.  There are
several ways in which you can refer to multiple parameters.  You can
refer to multiple model parameters in the parameter specification
paramref of the prior(paramref, ...) option, of the block(paramref, ...)
option, or of the initial(paramref #) option.

The most straightforward way to refer to multiple scalar model parameters
is to simply list them individually, as follows:

{param1} {param2} ...

but there are shortcuts.

For example, the alternative to the above is

{param1 param2} ...

where we simply list the names of all parameters inside one set of curly
braces.

If parameters have the same equation name, you can refer to all of the
parameters with that equation name as follows.  Suppose that we have
three parameters with the same equation name eqname, then the
specification

{eqname:param1} {eqname:param2} {eqname:param3}

is the same as the specification

{eqname:}

or the specification

{eqname:param1 param2 param3}

The above specification is useful if we want to refer to a subset of
parameters with the same equation name. For example, in the above, if we
wanted to refer to only param1 and param2, we could type

{eqname:param1 param2}

If a factor variable is used in the specification of the regression
function, you can use the same factor-variable specification within
paramref to refer to the coefficients associated with the levels of that
factor variable; see fvvarlists.

For example, factor variables are useful for constructing multilevel
Bayesian models.  Suppose that variable id defines the second level of
hierarchy in a two-level random-effects model.  We can fit a Bayesian
random-intercept model as follows.

. bayesmh y x i.id, likelihood(normal({var})) prior({y:i.id},
normal(0,{tau})) ...

Here we used {y:i.id} in the prior specification to refer to all levels
of id.

Similarly, we can add a random coefficient for a continuous covariate x
by typing

. bayesmh y c.x##i.id, likelihood(normal({var})) prior({y:i.id},
normal(0,{tau1})) prior({y:c.x#i.id}, normal(0,{tau2})) ...

You can mix and match all of the specifications above in one parameter
specification, paramref.

To refer to multiple matrix model parameters, you can use {paramlist,
matrix} to refer to matrix parameters with names paramlist and
{eqname:paramlist, matrix} to refer to matrix parameters with names in
paramlist and with equation name eqname.

For example, the specification

{eqname:Sigma1,m} {eqname:Sigma2,m} {Sigma3,m} {Sigma4,m}

is the same as the specification

{eqname:Sigma1 Sigma2,m} {Sigma3 Sigma4,m}

You cannot refer to both scalar and matrix parameters in one paramref
specification.

For referring to model parameters in postestimation commands, see
Different ways of specifying model parameters in [BAYES] bayesian
postestimation.

Substitutable expressions

You may use substitutable expressions in bayesmh to define nonlinear
expressions subexpr, arguments of outcome distributions in option
likelihood(), observation-level log likelihood in option llf(), arguments
of prior distributions in option prior(), and generic prior distributions
in prior()'s suboptions density() and logdensity().  Substitutable
expressions are just like any other mathematical expression in Stata,
except that they may include model parameters.

To specify a substitutable expression in your bayesmh model, you must
comply with the following rules:

1. Model parameters are bound in braces: {mu}, {var:sigma2}, {Sigma,
matrix}, and {Cov:Sigma, matrix}.

2. Linear combinations can be specified using the notation

{eqname:varlist[, xb noconstant]}

For example, {lc:mpg price weight} is equivalent to

{lc:mpg}*mpg + {lc:price}*price + {lc:weight}*weight +
{lc:_cons}

The xb option is used to distinguish between the linear combination
that contains one variable and a free parameter that has the same name
as the variable and the same group name as the linear combination. For
example, {lc:weight, xb} is equivalent to {lc:_cons} +
{lc:weight}*weight, whereas {lc:weight} refers to either a free
parameter weight with a group name lc or the coefficient of the weight
variable, if {lc:} has been previously defined in the expression as a
linear combination that involves variable weight.  Thus the xb option
indicates that the specification is a linear combination rather than a
single parameter to be estimated.

When you define a linear combination, a constant term is included by
default.  The noconstant option suppresses the constant.

See Linear combinations in [ME] menl for details about specifying
linear combinations.

3. Initial values are given by including an equal sign and the initial
value inside the braces, for example, {b1=1.267}, {gamma=3}, etc.  If
you do not specify an initial value, that parameter is initialized to
one for positive scalar parameters and to zero for other scalar
parameters, or it is initialized to its MLE, if available.  The
initial() option overrides initial values provided in substitutable
expressions.  Initial values for matrices must be specified in the
initial() option. By default, matrix parameters are initialized with
identity matrices.
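
As an illustration of rules 1 and 3 (a sketch only; it reuses the
oxygen data from the examples below, and the parameter names b0, b1,
and b2, the initial values, and the priors are assumptions made for
this illustration), a linear mean can be written as a substitutable
expression with initial values given inside the braces:

. webuse oxygen
. set seed 14
. bayesmh change = ({b0=0} + {b1=0}*age + {b2=0}*group),
likelihood(normal({var})) prior({b0} {b1} {b2}, normal(0,100))
prior({var}, igamma(0.1,0.1))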

Examples

---------------------------------------------------------------------------
Setup
. webuse oxygen

Bayesian normal linear regression with noninformative priors
. set seed 14
. bayesmh change age group, likelihood(normal({var}))
prior({change:}, flat) prior({var}, jeffreys)

Bayesian normal linear regression with normal and inverse-gamma priors
. set seed 14
. bayesmh change age group, likelihood(normal({var}))
prior({change:}, normal(0, {var})) prior({var}, igamma(2.5, 2.5))

Bayesian normal linear regression with multivariate Zellner's g-prior
. set seed 14
. bayesmh change age group, likelihood(normal({var}))
prior({change:}, zellnersg0(3,12,{var})) prior({var}, igamma(0.5,
4))

Update parameter {var} separately from other model coefficients
. set seed 14
. bayesmh change age group, likelihood(normal({var}))
prior({change:}, zellnersg0(3,12,{var})) prior({var}, igamma(0.5,
4)) block({var})

Use Gibbs sampling for parameter {var} and display the summary about
blocks
. set seed 14
. bayesmh change age group, likelihood(normal({var}))
prior({change:}, normal(0, 100)) prior({var}, igamma(0.5, 4))
block({var}, gibbs) blocksummary

Bayesian logistic regression model with a noninformative prior
. webuse hearthungary
. set seed 14
. bayesmh disease restecg isfbs age male, likelihood(logit)
prior({disease:}, normal(0,1000))

Bayesian ordered probit model including hyperparameter {lambda}
. webuse fullauto
. replace length = length/10
. set seed 14
. bayesmh rep77 foreign length mpg, likelihood(oprobit) prior({rep77:
foreign length mpg}, normal(0,1)) prior({rep77:_cut1 _cut2 _cut3
_cut4}, exponential({lambda=30})) prior({lambda}, uniform(10,40))
block(lambda)

---------------------------------------------------------------------------
Setup
. sysuse auto, clear
. replace weight = weight/1000
. replace length = length/100
. replace mpg = mpg/10

Bayesian multivariate normal model including matrix parameter {Sigma} for
the covariance matrix
. set seed 14
. bayesmh (mpg) (weight) (length), likelihood(mvnormal({Sigma,m}))
prior({mpg:_cons} {weight:_cons} {length:_cons}, normal(0,100))
prior({Sigma,m}, iwishart(3,100,I(3))) block({mpg:_cons}
{weight:_cons} {length:_cons}) block({Sigma,m}) dots

. set seed 14
. bayesmh (mpg) (weight) (length), likelihood(mvnormal({Sigma,m}))
prior({mpg:_cons} {weight:_cons} {length:_cons}, normal(0,100))
prior({Sigma,m}, iwishart(3,100,I(3))) block({mpg:_cons}
{weight:_cons} {length:_cons}) block({Sigma,m}) dots burnin(5000)

Request Gibbs sampling for covariance matrix {Sigma}
. set seed 14
. bayesmh (mpg) (weight) (length), likelihood(mvnormal({Sigma,m}))
prior({mpg:_cons} {weight:_cons} {length:_cons}, normal(0,100))
prior({Sigma,m}, iwishart(3,100,I(3))) block({mpg:_cons}
{weight:_cons} {length:_cons}) block({Sigma,m}, gibbs) dots

---------------------------------------------------------------------------
Setup
. webuse pig, clear

Bayesian linear random-intercept model
. set seed 14
. fvset base none id
. bayesmh weight week i.id, likelihood(normal({var_0})) noconstant
prior({weight:i.id}, normal({weight:_cons},{var_id}))
prior({weight:_cons}, normal(0, 100)) prior({weight:week},
normal(0, 100)) prior({var_0}, igamma(0.001, 0.001))
prior({var_id}, igamma(0.001, 0.001)) mcmcsize(5000) dots

Bayesian linear random-intercept model using the reffects() option
. set seed 14
. bayesmh weight week, reffects(id) likelihood(normal({var_0}))
noconstant prior({weight:i.id}, normal({weight:_cons},{var_id}))
prior({weight:_cons}, normal(0, 100)) prior({weight:week},
normal(0, 100)) prior({var_0}, igamma(0.001, 0.001))
prior({var_id}, igamma(0.001, 0.001)) mcmcsize(5000) dots

---------------------------------------------------------------------------
Setup
. webuse coal

Analysis of a change point problem with target MCMC sample size of 20,000
. set seed 14
. bayesmh count,
likelihood(dpoisson({mu1}*sign(year<{cp})+{mu2}*sign(year>={cp})))
prior({mu1} {mu2}, flat) prior({cp}, uniform(1851,1962))
initial({mu1} 1 {mu2} 1 {cp} 1906) mcmcsize(20000)

---------------------------------------------------------------------------

Video examples

Introduction to Bayesian statistics, part 1: The basic concepts

Introduction to Bayesian statistics, part 2: MCMC and the
Metropolis-Hastings algorithm

Stored results

bayesmh stores the following in e():

Scalars
e(N)                 number of observations
e(k)                 number of parameters
e(k_sc)              number of scalar parameters
e(k_mat)             number of matrix parameters
e(n_eq)              number of equations
e(mcmcsize)          MCMC sample size
e(burnin)            number of burn-in iterations
e(mcmciter)          total number of MCMC iterations
e(thinning)          thinning interval
e(arate)             overall AR
e(eff_min)           minimum efficiency
e(eff_avg)           average efficiency
e(eff_max)           maximum efficiency
e(clevel)            credible interval level
e(hpd)               1 if hpd is specified, 0 otherwise
e(batch)             batch length for batch-mean calculations
e(corrlag)           maximum autocorrelation lag
e(corrtol)           autocorrelation tolerance
e(dic)               deviance information criterion
e(lml_lm)            log marginal-likelihood using Laplace-Metropolis
method
e(scale)             initial multiplier for scale factor; scale()
e(block#_gibbs)      1 if Gibbs sampling is used in #th block, 0
otherwise
e(block#_reffects)   1 if the parameters in #th block are random
effects, 0 otherwise
e(block#_scale)      #th block initial multiplier for scale factor
e(block#_tarate)     #th block target acceptance rate
e(block#_arate_last) #th block AR from the last adaptive iteration
e(repeat)            number of attempts used to find feasible initial
values

Macros
e(cmd)               bayesmh
e(cmdline)           command as typed
e(method)            sampling method
e(depvars)           names of dependent variables
e(eqnames)           names of equations
e(likelihood)        likelihood distribution (one equation)
e(likelihood#)       likelihood distribution for #th equation
e(prior)             prior distribution
e(prior#)            prior distribution, if more than one prior() is
specified
e(priorparams)       parameter specification in prior()
e(priorparams#)      parameter specification from #th prior(), if more
than one prior() is specified
e(parnames)          names of model parameters except exclude()
e(postvars)          variable names corresponding to model parameters
in e(parnames)
e(subexpr)           substitutable expression
e(subexpr#)          substitutable expression, if more than one
e(wtype)             weight type (one equation)
e(wtype#)            weight type for #th equation
e(wexp)              weight expression (one equation)
e(wexp#)             weight expression for #th equation
e(block#_names)      parameter names from #th block
e(exclude)           names of excluded parameters
e(filename)          name of the file with simulation results
e(scparams)          scalar model parameters
e(matparams)         matrix model parameters
e(pareqmap)          model parameters in display order
e(title)             title in estimation output
e(rngstate)          random-number state at the time of simulation
e(search)            on, repeat(), or off

Matrices
e(mean)              posterior means
e(sd)                posterior standard deviations
e(mcse)              MCSE
e(median)            posterior medians
e(cri)               credible intervals
e(Cov)               variance-covariance matrix of parameters
e(ess)               effective sample sizes
e(init)              initial values vector

Functions
e(sample)            marks estimation sample

```
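
The stored results listed above can be inspected directly after estimation. The following is a minimal sketch, assuming a model has just been fit with bayesmh (for instance, the first oxygen example); it relies only on standard Stata commands for reading e() results and on result names taken from the table above.

```stata
* list everything bayesmh left in e()
ereturn list

* scalars: overall acceptance rate and MCMC sample size
display e(arate)
display e(mcmcsize)

* matrices: posterior means and credible intervals
matrix list e(mean)
matrix list e(cri)

* macros: parameter names and the matching variable names
* in the simulation dataset
display "`e(parnames)'"
display "`e(postvars)'"
```

Which entries are populated depends on the model; for example, the e(block#_...) results exist only for blocks that were actually defined.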