Stata 15 help for bayesmh

[BAYES] bayesmh -- Bayesian regression using Metropolis-Hastings algorithm

Syntax

Univariate linear models

bayesmh depvar [indepvars] [if] [in] [weight], likelihood(modelspec) prior(priorspec) [reffects(varname) options]

Multivariate linear models

Multivariate normal linear regression with common regressors

bayesmh depvars = [indepvars] [if] [in] [weight], likelihood(mvnormal(...)) prior(priorspec) [options]

Multivariate normal regression with outcome-specific regressors

bayesmh ([eqname1:]depvar1 [indepvars1]) ([eqname2:]depvar2 [indepvars2]) [...] [if] [in] [weight], likelihood(mvnormal(...)) prior(priorspec) [options]

Multiple-equation linear models

bayesmh (eqspec) [(eqspec)] [...] [if] [in] [weight], prior(priorspec) [options]

Nonlinear models

Univariate nonlinear regression

bayesmh depvar = (subexpr) [if] [in] [weight], likelihood(modelspec) prior(priorspec) [options]

Multivariate normal nonlinear regression

bayesmh (depvar1 = (subexpr1)) (depvar2 = (subexpr2)) [...] [if] [in] [weight], likelihood(mvnormal(...)) prior(priorspec) [options]

Probability distributions

Univariate distributions

bayesmh depvar [if] [in] [weight], likelihood(distribution) prior(priorspec) [options]

Multiple-equation distribution specifications

bayesmh (deqspec) [(deqspec)] [...] [if] [in] [weight], prior(priorspec) [options]

The syntax of eqspec is

varspec [if] [in] [weight], likelihood(modelspec) [noconstant]

The syntax of varspec is one of the following:

for single outcome

[eqname:]depvar [indepvars]

for multiple outcomes with common regressors

depvars = [indepvars]

for multiple outcomes with outcome-specific regressors

([eqname1:]depvar1 [indepvars1]) ([eqname2:]depvar2 [indepvars2]) [...]

The syntax of deqspec is

[eqname:] depvar [if] [in] [weight], likelihood(distribution)

subexpr, subexpr1, subexpr2, and so on are substitutable expressions; see Substitutable expressions for details.

The syntax of modelspec is

model [, modelopts]

model               Description
-------------------------------------------------------------------------
Model
  normal(var)       normal regression with variance var
  t(sigma2, df)     t regression with squared scale sigma2 and degrees of
                      freedom df
  lognormal(var)    lognormal regression with variance var
  lnormal(var)      synonym for lognormal()
  exponential       exponential regression
  mvnormal(Sigma)   multivariate normal regression with covariance matrix
                      Sigma

  probit            probit regression
  logit             logistic regression
  logistic          logistic regression; synonym for logit
  binomial(n)       binomial regression with logit link and number of
                      trials n
  binlogit(n)       synonym for binomial()
  oprobit           ordered probit regression
  ologit            ordered logistic regression
  poisson           Poisson regression

  llf(subexpr)      substitutable expression for observation-level
                      log-likelihood function
-------------------------------------------------------------------------
A distribution argument is a number for scalar arguments such as var; a variable name, varname (except for matrix arguments); a matrix for matrix arguments such as Sigma; a model parameter, paramspec; an expression, expr; or a substitutable expression, subexpr. See Specifying arguments of likelihood models and prior distributions.

modelopts             Description
-------------------------------------------------------------------------
Model
  offset(varname_o)   include varname_o in model with coefficient
                        constrained to 1; not allowed with normal() and
                        mvnormal()
  exposure(varname_e) include ln(varname_e) in model with coefficient
                        constrained to 1; allowed only with poisson
-------------------------------------------------------------------------

distribution          Description
-------------------------------------------------------------------------
Model
  dexponential(beta)  exponential distribution with scale parameter beta
  dbernoulli(p)       Bernoulli distribution with success probability p
  dbinomial(p,n)      binomial distribution with success probability p and
                        number of trials n
  dpoisson(mu)        Poisson distribution with mean mu
-------------------------------------------------------------------------
A distribution argument is a model parameter, paramspec, or a substitutable expression, subexpr, containing model parameters. An n argument may be a number; an expression, expr; or a variable name, varname. See Specifying arguments of likelihood models and prior distributions.

The syntax of priorspec is

paramref, priordist

where the simplest specification of paramref is

paramspec [paramspec [...]]

Also see Referring to model parameters for other specifications.

The syntax of paramspec is

{[eqname:]param[, matrix]}

where the parameter label eqname and parameter name param are valid Stata names. Model parameters are either scalars, such as {var}, {mean}, and {scale:beta}, or matrices, such as {Sigma, matrix} and {Scale:V, matrix}. For scalar parameters, you can use {param=#} to specify an initial value. For example, you can specify {var=1}, {mean=1.267}, or {shape:alpha=3}.

priordist                   Description
-------------------------------------------------------------------------
Model
  normal(mu,var)            normal with mean mu and variance var
  t(mu,sigma2,df)           location-scale t with mean mu, squared scale
                              sigma2, and degrees of freedom df
  lognormal(mu,var)         lognormal with mean mu and variance var
  lnormal(mu,var)           synonym for lognormal()
  uniform(a,b)              uniform on (a,b)
  gamma(alpha,beta)         gamma with shape alpha and scale beta
  igamma(alpha,beta)        inverse gamma with shape alpha and scale beta
  exponential(beta)         exponential with scale beta
  laplace(mu,beta)          Laplace with mean mu and scale beta
  cauchy(loc,beta)          Cauchy with location loc and scale beta
  beta(a,b)                 beta with shape parameters a and b
  chi2(df)                  central chi-squared with degrees of freedom df
  jeffreys                  Jeffreys prior for variance of a normal
                              distribution

  mvnormal(d,mean,Sigma)    multivariate normal of dimension d with mean
                              vector mean and covariance matrix Sigma; mean
                              can be a matrix name or a list of d means
                              separated by comma: mu1, mu2, ..., mud
  mvnormal0(d,Sigma)        multivariate normal of dimension d with zero
                              mean vector and covariance matrix Sigma
  mvn0(d,Sigma)             synonym for mvnormal0()
  zellnersg(d,g,mean,{var}) Zellner's g-prior of dimension d with g degrees
                              of freedom, mean vector mean, and variance
                              parameter {var}; mean can be a matrix name or
                              a list of d means separated by comma: mu1,
                              mu2, ..., mud
  zellnersg0(d,g,{var})     Zellner's g-prior of dimension d with g degrees
                              of freedom, zero mean vector, and variance
                              parameter {var}
  wishart(d,df,V)           Wishart of dimension d with degrees of freedom
                              df and scale matrix V
  iwishart(d,df,V)          inverse Wishart of dimension d with degrees of
                              freedom df and scale matrix V
  jeffreys(d)               Jeffreys prior for covariance of a multivariate
                              normal distribution of dimension d

  bernoulli(p)              Bernoulli with success probability p
  index(p1,...,pk)          discrete indices 1, 2, ..., k with probabilities
                              p1, p2, ..., pk
  poisson(mu)               Poisson with mean mu

  flat                      flat prior; equivalent to density(1) or
                              logdensity(0)
  density(f)                generic density f
  logdensity(logf)          generic logdensity logf
-------------------------------------------------------------------------
Dimension d is a positive #.

A distribution argument is a number for scalar arguments such as var, alpha, beta; a Stata matrix for matrix arguments such as Sigma and V; a model parameter, paramspec; an expression, expr; or a substitutable expression, subexpr. See Specifying arguments of likelihood models and prior distributions.

f is a nonnegative number, #; an expression, expr; or a substitutable expression, subexpr.

logf is a number, #; an expression, expr; or a substitutable expression, subexpr.

When mvnormal() or mvnormal0() of dimension d is applied to paramref with n parameters (n!=d), paramref is reshaped into a matrix with d columns, and its rows are treated as independent samples from the specified mvnormal() distribution. If such reshaping is not possible, an error is issued. See example 25 for application of this feature.

options                       Description
-------------------------------------------------------------------------
Model
  noconstant                  suppress constant term; not allowed with
                                ordered models, nonlinear models, and
                                probability distributions
* likelihood(lspec)           distribution for the likelihood model
* prior(priorspec)            prior for model parameters; this option may
                                be repeated
  dryrun                      show model summary without estimation

Model 2
  redefine(label:i.varname)   specify a random-effects linear form; this
                                option may be repeated
  xbdefine(label:varlist)     specify a linear form

Simulation
  mcmcsize(#)                 MCMC sample size; default is mcmcsize(10000)
  burnin(#)                   burn-in period; default is burnin(2500)
  thinning(#)                 thinning interval; default is thinning(1)
  rseed(#)                    random-number seed
  exclude(paramref)           specify model parameters to be excluded from
                                the simulation results

Blocking
  block(paramref[, blockopts]) specify a block of model parameters; this
                                option may be repeated
  blocksummary                display block summary

Initialization
  initial(initspec)           initial values for model parameters
  nomleinitial                suppress the use of maximum likelihood
                                estimates as starting values
  initrandom                  specify random initial values
  initsummary                 display initial values used for simulation

Adaptation
  adaptation(adaptopts)       control the adaptive MCMC procedure
  scale(#)                    initial multiplier for scale factor; default
                                is scale(2.38)
  covariance(cov)             initial proposal covariance; default is the
                                identity matrix

Reporting
  clevel(#)                   set credible interval level; default is
                                clevel(95)
  hpd                         display HPD credible intervals instead of the
                                default equal-tailed credible intervals
  eform[(string)]             report exponentiated coefficients and,
                                optionally, label as string
  batch(#)                    specify length of block for batch-means
                                calculations; default is batch(0)
  saving(filename, replace)   save simulation results to filename.dta
  nomodelsummary              suppress model summary
  noexpression                suppress output of expressions from model
                                summary
  [no]dots                    suppress dots or display dots every 100
                                iterations and iteration numbers every
                                1,000 iterations; default is nodots
  dots(#[, every(#)])         display dots as simulation is performed
  [no]show(paramref)          specify model parameters to be excluded from
                                or included in the output
  showreffects[(reref)]       specify that all or a subset of
                                random-effects parameters be included in
                                the output
  notable                     suppress estimation table
  noheader                    suppress output header
  title(string)               display string as title above the table of
                                parameter estimates
  display_options             control spacing, line width, and base and
                                empty cells

Advanced
  search(search_options)      control the search for feasible initial
                                values
  corrlag(#)                  specify maximum autocorrelation lag; default
                                varies
  corrtol(#)                  specify autocorrelation tolerance; default is
                                corrtol(0.01)
-------------------------------------------------------------------------
* Options likelihood() and prior() are required. prior() must be specified for all model parameters.
Options prior(), redefine(), and block() can be repeated.
indepvars and paramref may contain factor variables; see fvvarlist.
With multiple-equations specifications, a local if specified within an equation is applied together with the global if specified with the command.
Only fweights are allowed; see weight.
With multiple-equations specifications, local weights (weights specified within an equation) override global weights (weights specified with the command).
See [BAYES] bayesian postestimation for features available after estimation.

blockopts               Description
-------------------------------------------------------------------------
  gibbs                 requests Gibbs sampling; available for selected
                          models only and not allowed with scale(),
                          covariance(), or adaptation()
  split                 requests that all parameters in a block be treated
                          as separate blocks
  reffects              requests that all parameters in a block be treated
                          as random-effects parameters
  scale(#)              initial multiplier for scale factor for current
                          block; default is scale(2.38); not allowed with
                          gibbs
  covariance(cov)       initial proposal covariance for the current block;
                          default is the identity matrix; not allowed with
                          gibbs
  adaptation(adaptopts) control the adaptive MCMC procedure of the current
                          block; not allowed with gibbs
-------------------------------------------------------------------------
Only tarate() and tolerance() may be specified in the adaptation() option.

adaptopts         Description
-------------------------------------------------------------------------
  every(#)        adaptation interval; default is every(100)
  maxiter(#)      maximum number of adaptation loops; default is
                    maxiter(25) or max{25,floor(burnin()/every())}
                    whenever default values of these options are modified
  miniter(#)      minimum number of adaptation loops; default is
                    miniter(5)
  alpha(#)        parameter controlling acceptance rate (AR); default is
                    alpha(0.75)
  beta(#)         parameter controlling proposal covariance; default is
                    beta(0.8)
  gamma(#)        parameter controlling adaptation rate; default is
                    gamma(0)
* tarate(#)       target acceptance rate (TAR); default is parameter
                    specific
* tolerance(#)    tolerance for AR; default is tolerance(0.01)
-------------------------------------------------------------------------
* Only starred options may be specified in the adaptation() option specified within block().

Menu

Statistics > Bayesian analysis > General estimation and regression

Description

bayesmh fits a variety of Bayesian models using an adaptive Metropolis-Hastings (MH) algorithm. It provides various likelihood models and prior distributions for you to choose from. Likelihood models include univariate normal linear and nonlinear regressions, multivariate normal linear and nonlinear regressions, generalized linear models such as logit and Poisson regressions, and multiple-equations linear models. Prior distributions include continuous distributions such as uniform, Jeffreys, normal, gamma, multivariate normal, and Wishart and discrete distributions such as Bernoulli and Poisson. You can also program your own Bayesian models; see [BAYES] bayesmh evaluators.

Also see [BAYES] bayesian estimation for a list of Bayesian regression models that can be fit more conveniently with the bayes prefix ([BAYES] bayes).

Options

        +-------+
    ----+ Model +------------------------------------------------------------

noconstant suppresses the constant term (intercept) from the regression model. By default, bayesmh automatically includes a model parameter {depname:_cons} in all regression models except ordered and nonlinear models. Excluding the constant term may be desirable when there is a factor variable, the base level of which absorbs the constant term in the linear combination.

likelihood(lspec) specifies the distribution of the data. This option specifies the likelihood portion of the Bayesian model. This option is required. lspec is one of modelspec or distribution.

modelspec specifies one of the supported likelihood distributions for regression models. A location parameter of these distributions is automatically parameterized as a linear combination of the specified independent variables and need not be specified. Other parameters may be specified as arguments to the distribution separated by commas. Each argument may be a real number (#), a variable name (except for matrix parameters), a predefined matrix, a model parameter specified in { }, a Stata expression, or a substitutable expression containing model parameters; see Declaring model parameters and Specifying arguments of likelihood models and prior distributions in [BAYES] bayesmh.

distribution specifies one of the supported distributions for modeling the dependent variable. A distribution argument must be a model parameter specified in { } or a substitutable expression containing model parameters; see Declaring model parameters and Specifying arguments of likelihood models and prior distributions in [BAYES] bayesmh. A number of trials, n, of the binomial distribution may be a real number (#), a Stata expression, or a variable name. For an example of modeling outcome distributions directly, see Beta-binomial model in [BAYES] bayesmh.
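As a minimal sketch of modeling an outcome distribution directly, the following fits a Bernoulli distribution with a single success probability to a binary variable. The dataset (auto, shipped with Stata), the parameter name {p}, the beta(1,1) prior, the starting value, and the seed are illustrative assumptions, not part of this entry.

```stata
* Illustrative only: the binary variable foreign from the auto dataset
sysuse auto, clear

* The distribution argument is a model parameter in { };
* {p} is a hypothetical name chosen for this sketch
bayesmh foreign, likelihood(dbernoulli({p}))  ///
    prior({p}, beta(1,1))                     ///
    initial({p} 0.5) rseed(14)
```

The starting value of 0.5 is supplied only so that the chain begins inside (0,1); it is a cautious choice rather than a requirement stated above.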

For some regression models, option likelihood() provides suboptions subopts in likelihood(..., subopts). subopts is offset() or exposure().

offset(varname_o) specifies that varname_o be included in the regression model with the coefficient constrained to be 1. This option is available with probit, logit, binomial(), binlogit(), oprobit, ologit, and poisson.

exposure(varname_e) specifies a variable that reflects the amount of exposure over which the depvar events were observed for each observation; ln(varname_e) with coefficient constrained to be 1 is entered into the log-link function. This option is available with poisson.
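The suboptions above are given after a comma inside likelihood(). A minimal sketch, assuming the dollhill3 example dataset (with variables deaths, smokes, and pyears) is available through webuse; the priors and seed are arbitrary choices for illustration:

```stata
* Assumed example dataset: deaths, smoking status, and person-years
webuse dollhill3, clear

* ln(pyears) enters the linear predictor with its coefficient fixed at 1
bayesmh deaths smokes, likelihood(poisson, exposure(pyears))  ///
    prior({deaths:}, normal(0,100)) rseed(14)
```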

prior(priorspec) specifies a prior distribution for model parameters. This option is required and may be repeated. A prior must be specified for each model parameter. Model parameters may be scalars or matrices, but both types may not be combined in one prior statement. If multiple scalar parameters are assigned a single univariate prior, they are considered independent, and the specified prior is used for each parameter. You may assign a multivariate prior of dimension d to d scalar parameters. Also see Referring to model parameters below and Specifying arguments of likelihood models and prior distributions in [BAYES] bayesmh.

All likelihood() and prior() combinations are allowed, but they are not guaranteed to correspond to proper posterior distributions. You need to think carefully about the model you are building and evaluate its convergence thoroughly.

dryrun specifies to show the summary of the model that would be fit without actually fitting the model. This option is recommended for checking specifications of the model before fitting the model. The model summary reports the information about the likelihood model and about priors for all model parameters.
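For instance, a normal linear regression needs a prior for the regression coefficients and for the variance parameter, and dryrun lets you inspect that specification before any simulation runs. This is a sketch only; the auto dataset, the parameter name {var}, and the particular priors are assumptions made for illustration.

```stata
sysuse auto, clear

* {mpg:} refers to all coefficients in equation mpg, including _cons;
* {var} is the variance parameter declared in likelihood()
bayesmh mpg weight, likelihood(normal({var}))  ///
    prior({mpg:}, normal(0,100))               ///
    prior({var}, igamma(0.1,0.1))              ///
    dryrun
```

Dropping dryrun from the same command fits the model.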

        +---------+
    ----+ Model 2 +----------------------------------------------------------

reffects(varname) specifies a random-effects variable, a variable identifying the group structure for the random effects, with univariate linear models. This option is useful for fitting two-level random-intercept models. A random-effects variable is treated as a factor variable with no base level. As such, you can refer to random-effects parameters or, simply, random effects associated with varname using a conventional factor-variable notation. For example, you can use {depvar:i.varname} to refer to all random-effects parameters of varname. These parameters must be included in a single prior statement, usually a normal distribution with variance specified by an additional parameter. The random-effects parameters are assumed to be conditionally independent across levels of varname given all other model parameters. The random-effects parameters are automatically grouped in one block and are thus not allowed in the block() option. See example 23.
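A two-level random-intercept model along these lines might be specified as follows. This is a hedged sketch: it assumes the pig example dataset (variables weight, week, and id) is available through webuse, and the parameter names {var_0} and {var_id}, the priors, and the seed are hypothetical choices rather than recommendations.

```stata
* Assumed example data: repeated weight measurements on pigs (id)
webuse pig, clear

* All random intercepts {weight:i.id} share one normal prior whose
* variance is the additional parameter {var_id}
bayesmh weight week, reffects(id) likelihood(normal({var_0}))  ///
    prior({weight:i.id}, normal(0,{var_id}))                   ///
    prior({weight:week}, normal(0,100))                        ///
    prior({weight:_cons}, normal(0,100))                       ///
    prior({var_0}, igamma(0.001,0.001))                        ///
    prior({var_id}, igamma(0.001,0.001))                       ///
    rseed(14)
```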

redefine(label:i.varname) specifies a random-effects linear form that can be used in substitutable expressions. You can use {label:} to refer to the linear form in substitutable expressions. You can specify {label:i.varname} to refer to all random-effects parameters associated with varname. The random-effects parameters are automatically grouped in one block and are thus not allowed in the block() option. This option is useful for fitting multilevel models and can be repeated. See example 29.

xbdefine(label:varlist) specifies a linear form of the variables in varlist that can be used in substitutable expressions. You can use the specification {label:} to refer to the linear form in substitutable expressions. For any varname in varlist, you can use {label:varname} to refer to the corresponding parameter. This option is useful with nonlinear specifications when the linear form contains many variables and provides more efficient computation in such cases.

        +------------+
    ----+ Simulation +-------------------------------------------------------

mcmcsize(#) specifies the target MCMC sample size. The default MCMC sample size is mcmcsize(10000). The total number of iterations for the MH algorithm equals the sum of the burn-in iterations and the MCMC sample size in the absence of thinning. If thinning is present, the total number of MCMC iterations is computed as burnin() + (mcmcsize() - 1) x thinning() + 1. Computation time of the MH algorithm is proportional to the total number of iterations. The MCMC sample size determines the precision of posterior summaries, which may be different for different model parameters and will depend on the efficiency of the Markov chain. Also see Burn-in period and MCMC sample size in [BAYES] bayesmh.
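The iteration count can be checked by hand with display. The lines below simply evaluate the formula above for the default settings and for a hypothetical thinning(5):

```stata
* Total MH iterations = burnin() + (mcmcsize() - 1) x thinning() + 1
display 2500 + (10000 - 1)*1 + 1    // defaults: 12,500 iterations
display 2500 + (10000 - 1)*5 + 1    // thinning(5): 52,496 iterations
```

Because computation time is proportional to this total, thinning by 5 makes the run roughly four times longer for the same MCMC sample size.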

burnin(#) specifies the number of iterations for the burn-in period of MCMC. The values of parameters simulated during burn-in are used for adaptation purposes only and are not used for estimation. The default is burnin(2500). Typically, burn-in is chosen to be as long as or longer than the adaptation period. Also see Burn-in period and MCMC sample size and Convergence of MCMC in [BAYES] bayesmh.

thinning(#) specifies the thinning interval. Only simulated values from every (1+k x #)th iteration for k = 0, 1, 2, ... are saved in the final MCMC sample; all other simulated values are discarded. The default is thinning(1); that is, all simulation values are saved. Thinning greater than one is typically used for decreasing the autocorrelation of the simulated MCMC sample.

rseed(#) sets the random-number seed. This option can be used to reproduce results. rseed(#) is equivalent to typing set seed # prior to calling bayesmh; see [R] set seed and Reproducing results in [BAYES] bayesmh.
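The simulation options can be combined in one call. The following sketch assumes the auto dataset and an arbitrary set of priors; the specific sizes and the seed are illustrative values only.

```stata
sysuse auto, clear

* Keep 5,000 draws after 1,000 burn-in iterations, saving every 5th
* iteration; rseed() makes the run reproducible
bayesmh mpg weight, likelihood(normal({var}))  ///
    prior({mpg:}, normal(0,100))               ///
    prior({var}, igamma(0.1,0.1))              ///
    mcmcsize(5000) burnin(1000) thinning(5) rseed(14)
```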

exclude(paramref) specifies which model parameters should be excluded from the final MCMC sample. These model parameters will not appear in the estimation table, and postestimation features for these parameters and log marginal likelihood will not be available. This option is useful for suppressing nuisance model parameters. For example, if you have a factor predictor variable with many levels but you are only interested in the variability of the coefficients associated with its levels, not their actual values, then you may wish to exclude this factor variable from the simulation results. If you simply want to omit some model parameters from the output, see the noshow() option. paramref can include individual random-effects parameters.

        +----------+
    ----+ Blocking +---------------------------------------------------------

block(paramref[, blockopts]) specifies a group of model parameters for the blocked MH algorithm. By default, all parameters except matrices are treated as one block, and each matrix parameter is viewed as a separate block. You can use the block() option to separate scalar parameters in multiple blocks. Technically, you can also use block() to combine matrix parameters in one block, but this is not recommended. The block() option may be repeated to define multiple blocks. Different types of model parameters, such as scalars and matrices, may not be specified in one block(). Parameters within one block are updated simultaneously, and each block of parameters is updated in the order it is specified; the first specified block is updated first, the second is updated second, and so on. See Improving efficiency of the MH algorithm---blocking of parameters in [BAYES] bayesmh.

blockopts include gibbs, split, reffects, scale(), covariance(), and adaptation().

gibbs specifies to use Gibbs sampling to update parameters in the block. This option is allowed only for specific combinations of likelihood models and prior distributions; see Gibbs sampling for some likelihood-prior and prior-hyperprior configurations in [BAYES] bayesmh. For more information, see Gibbs and hybrid MH sampling in [BAYES] bayesmh. gibbs may not be combined with reffects, scale(), covariance(), or adaptation().

split specifies that all parameters in a block are treated as separate blocks. This may be useful for levels of factor variables.

reffects specifies that the parameters associated with the levels of a factor variable included in the likelihood specification be treated as random-effects parameters. Random-effects parameters must be included in one prior statement and are assumed to be conditionally independent across levels of a grouping variable given all other model parameters. reffects requires that parameters be specified as {depvar:i.varname}, where i.varname is the corresponding factor variable in the likelihood specification, and may not be combined with block()'s suboptions gibbs and split. This option is useful for fitting hierarchical or multilevel models. See example 25 in [BAYES] bayesmh for details.

scale(#) specifies an initial multiplier for the scale factor corresponding to the specified block. The initial scale factor is computed as #/sqrt{n_p} for continuous parameters and as #/n_p for discrete parameters, where n_p is the number of parameters in the block. The default is scale(2.38). If specified, this option overrides the respective setting from the scale() option specified with the command. scale() may not be combined with gibbs.

covariance(matname) specifies a scale matrix matname to be used to compute an initial proposal covariance matrix corresponding to the specified block. The initial proposal covariance is computed as rho x Sigma, where rho is a scale factor and Sigma = matname. By default, Sigma is the identity matrix. If specified, this option overrides the respective setting from the covariance() option specified with the command. covariance() may not be combined with gibbs.

adaptation(tarate()) and adaptation(tolerance()) specify block-specific TAR and acceptance tolerance. If specified, they override the respective settings from the adaptation() option specified with the command. adaptation() may not be combined with gibbs.

blocksummary displays the summary of the specified blocks. This option is useful when block() is specified.
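As an illustration of blocking, the sketch below places the variance parameter in its own block and requests Gibbs updates for it, which should be available for a normal likelihood paired with an inverse-gamma prior on the variance. The dataset, parameter name {var}, and priors are assumptions made for this example.

```stata
sysuse auto, clear

* The coefficients stay in the default MH block; {var} is updated
* separately by Gibbs sampling, and blocksummary reports the layout
bayesmh mpg weight, likelihood(normal({var}))  ///
    prior({mpg:}, normal(0,100))               ///
    prior({var}, igamma(0.1,0.1))              ///
    block({var}, gibbs) blocksummary rseed(14)
```

Replacing gibbs with scale() or adaptation(tarate()) would instead tune the MH update of that block; those suboptions cannot be combined with gibbs.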

        +----------------+
    ----+ Initialization +---------------------------------------------------

initial(initspec) specifies initial values for the model parameters to be used in the simulation. You can specify a parameter name, its initial value, another parameter name, its initial value, and so on. For example, to initialize a scalar parameter alpha to 0.5 and a 2x2 matrix Sigma to the identity matrix I(2), you can type

bayesmh ..., initial({alpha} 0.5 {Sigma,m} I(2)) ...

You can also specify a list of parameters using any of the specifications described in Referring to model parameters in [BAYES] bayesmh. For example, to initialize all regression coefficients from equations y1 and y2 to zero, you can type

bayesmh ..., initial({y1:} {y2:} 0) ...

The general specification of initspec is

paramref # [paramref # [...]]

Curly braces may be omitted for scalar parameters but must be specified for matrix parameters. Initial values declared using this option override the default initial values or any initial values declared during parameter specification in the likelihood() option. See Specifying initial values in [BAYES] bayesmh for details.

nomleinitial suppresses the use of maximum likelihood estimates (MLEs) as starting values for model parameters. By default, when no initial values are specified, MLE values (when available) are used as initial values. If nomleinitial is specified and no initial values are provided, the command uses ones for positive scalar parameters, zeros for other scalar parameters, and identity matrices for matrix parameters. nomleinitial may be useful for providing an alternative starting state when checking convergence of MCMC. This option cannot be combined with initrandom.

initrandom requests that the model parameters be initialized randomly. Random initial values are generated from the prior distributions of the model parameters. If you want to use fixed initial values for some of the parameters, you can specify them in the initial() option or during parameter declarations in the likelihood() option. Random initial values are not available for parameters with flat, density(), logdensity(), and jeffreys() priors; you must provide fixed initial values for such parameters. This option cannot be combined with nomleinitial.

initsummary specifies that the initial values used for simulation be displayed.
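The initialization options can be tried on a small model. This sketch assumes the auto dataset and arbitrary priors; the starting values 10 and 0 are illustrative only.

```stata
sysuse auto, clear

* Start {var} at 10 and every coefficient in equation mpg at 0,
* then display the starting values that were actually used
bayesmh mpg weight, likelihood(normal({var}))  ///
    prior({mpg:}, normal(0,100))               ///
    prior({var}, igamma(0.1,0.1))              ///
    initial({var} 10 {mpg:} 0) initsummary rseed(14)
```

Rerunning the same model with nomleinitial, or with initrandom, gives alternative starting states that can be compared when checking convergence.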

        +------------+
    ----+ Adaptation +-------------------------------------------------------

adaptation(adaptopts) controls adaptation of the MCMC procedure. Adaptation takes place every prespecified number of MCMC iterations and consists of tuning the proposal scale factor and proposal covariance for each block of model parameters. Adaptation is used to improve sampling efficiency. Provided defaults are based on theoretical results and may not be sufficient for all applications. See Adaptation of the MH algorithm in [BAYES] bayesmh for details about adaptation and its parameters.

adaptopts are any of the following options:

every(#) specifies that adaptation be attempted every #th iteration. The default is every(100). To determine the adaptation interval, you need to consider the maximum block size specified in your model. The update of a block with k model parameters requires the estimation of a k x k covariance matrix. If the adaptation interval is not sufficient for estimating the k(k+1)/2 elements of this matrix, the adaptation may be insufficient.

maxiter(#) specifies the maximum number of adaptive iterations. Adaptation includes tuning of the proposal covariance and of the scale factor for each block of model parameters. Once the TAR is achieved within the specified tolerance, the adaptation stops. However, no more than # adaptation steps will be performed. The default is variable and is computed as max{25,floor(burnin()/adaptation(every()))}.

maxiter() is usually chosen to be no greater than (mcmcsize()+burnin())/adaptation(every()).

miniter(#) specifies the minimum number of adaptive iterations to be performed regardless of whether the TAR has been achieved. The default is miniter(5). If the specified miniter() is greater than maxiter(), then miniter() is reset to maxiter(). Thus, if you specify maxiter(0), then no adaptation will be performed.

alpha(#) specifies a parameter controlling the adaptation of the AR. alpha() should be in [0,1]. The default is alpha(0.75).

beta(#) specifies a parameter controlling the adaptation of the proposal covariance matrix. beta() must be in [0,1]. The closer beta() is to zero, the less adaptive the proposal covariance. When beta() is zero, the same proposal covariance will be used in all MCMC iterations. The default is beta(0.8).

gamma(#) specifies a parameter controlling the adaptation rate of the proposal covariance matrix. gamma() must be in [0,1]. The larger the value of gamma(), the less adaptive the proposal covariance. The default is gamma(0).

tarate(#) specifies the TAR for all blocks of model parameters; this is rarely used. tarate() must be in (0,1). The default TAR is 0.234 for blocks containing multiple continuous parameters, 0.44 for blocks with one continuous parameter, and 1/n_maxlev for blocks with discrete parameters, where n_maxlev is the maximum number of levels for a discrete parameter in the block.

tolerance(#) specifies the tolerance criterion for adaptation based on the TAR. tolerance() should be in (0,1). Adaptation stops whenever the absolute difference between the current AR and TAR is less than tolerance(). The default is tolerance(0.01).
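To make the maxiter() default concrete, the first two lines below evaluate max{25,floor(burnin()/every())} for two burn-in lengths. The command that follows is a sketch of how several adaptopts might be passed together; the dataset, priors, and the particular values are assumptions, and the suboptions are written space-separated on the assumption that adaptation() follows the usual Stata option-list convention.

```stata
display max(25, floor(2500/100))   // burnin(2500), every(100): 25
display max(25, floor(5000/100))   // burnin(5000), every(100): 50

sysuse auto, clear

* Adapt every 200 iterations, at most 30 times, with a looser tolerance
bayesmh mpg weight, likelihood(normal({var}))  ///
    prior({mpg:}, normal(0,100))               ///
    prior({var}, igamma(0.1,0.1))              ///
    burnin(6000)                               ///
    adaptation(every(200) maxiter(30) tolerance(0.02)) rseed(14)
```

With burnin(6000) and every(200), the 30 adaptation steps fit exactly within the burn-in period.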

scale(#) specifies an initial multiplier for the scale factor for all blocks. The initial scale factor is computed as #/sqrt{n_p} for continuous parameters and #/n_p for discrete parameters, where n_p is the number of parameters in the block. The default is scale(2.38).
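As a quick check of the formula, the initial scale factor for a hypothetical block of four parameters under the default multiplier can be computed directly:

```stata
* Default multiplier 2.38 with n_p = 4 parameters in the block
display 2.38/sqrt(4)    // continuous parameters: #/sqrt(n_p)
display 2.38/4          // discrete parameters:   #/n_p
```

The continuous case works out to 2.38/2 = 1.19, so larger blocks start with proportionally smaller proposal steps.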

covariance(cov) specifies a scale matrix cov to be used to compute an initial proposal covariance matrix. The initial proposal covariance is computed as rho x Sigma, where rho is a scale factor and Sigma = cov. By default, Sigma is the identity matrix. Partial specification of Sigma is also allowed. The rows and columns of cov should be named after some or all model parameters. According to some theoretical results, the optimal proposal covariance is the posterior covariance matrix of model parameters, which is usually unknown. This option does not apply to the blocks containing random-effects parameters.

    +-----------+
----+ Reporting +--------------------------------------------------------

clevel(#) specifies the credible level, as a percentage, for equal-tailed and HPD credible intervals. The default is clevel(95) or as set by [BAYES] set clevel.

hpd specifies the display of HPD credible intervals instead of the default equal-tailed credible intervals.
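For instance, assuming the oxygen dataset used in the Examples section, the following sketch requests 90% HPD credible intervals at estimation time. The level of 90 is an arbitrary choice for illustration.

. webuse oxygen
. set seed 14
. bayesmh change age group, likelihood(normal({var})) prior({change:}, flat) prior({var}, jeffreys) clevel(90) hpd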

eform and eform(string) specify that the coefficient table be displayed in exponentiated form and that exp(b) and string, respectively, be used to label the exponentiated coefficients in the table.

batch(#) specifies the length of the block for calculating batch means, batch standard deviation, and MCSE using batch means. The default is batch(0), which means no batch calculations. When batch() is not specified, MCSE is computed using effective sample sizes instead of batch means. Option batch() may not be combined with corrlag() or corrtol().

saving(filename[, replace]) saves simulation results in filename.dta. The replace option specifies to overwrite filename.dta if it exists. If the saving() option is not specified, bayesmh saves simulation results in a temporary file for later access by postestimation commands. This temporary file will be overwritten every time bayesmh is run and will also be erased if the current estimation results are cleared. saving() may be specified during estimation or on replay.

The saved dataset has the following structure. Variable _index records iteration numbers. bayesmh saves only states (sets of parameter values) that are different from one iteration to another and the frequency of each state in variable _frequency. (Some states may be repeated for discrete parameters.) As such, _index may not necessarily contain consecutive integers. Remember to use _frequency as a frequency weight if you need to obtain any summaries of this dataset. Values for each parameter are saved in a separate variable in the dataset. Variables containing values of parameters without equation names are named as eq0_p#, following the order in which parameters are declared in bayesmh. Variables containing values of parameters with equation names are named as eq#_p#, again following the order in which parameters are defined. Parameters with the same equation names will have the same variable prefix eq#. For example,

. bayesmh y x1, likelihood(normal({var})) saving(mcmc) ...

will create a dataset, mcmc.dta, with variable names eq1_p1 for {y:x1}, eq1_p2 for {y:_cons}, and eq0_p1 for {var}. Also see macros e(parnames) and e(postvars) for the correspondence between parameter names and variable names.

In addition, bayesmh saves variable _loglikelihood to contain values of the log likelihood from each iteration and variable _logposterior to contain values of the log posterior from each iteration.
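A minimal sketch of summarizing the saved simulation dataset, assuming the model above was fit with saving(mcmc) so that mcmc.dta contains eq1_p1, eq1_p2, and eq0_p1. Loading the file replaces the data in memory, so this assumes the estimation data have been saved or are no longer needed.

. use mcmc, clear
. summarize eq1_p1 eq1_p2 eq0_p1 [fweight=_frequency]
. summarize _loglikelihood _logposterior [fweight=_frequency]

Without [fweight=_frequency], each distinct state would be counted once regardless of how many iterations the chain spent there, so the summaries would not reflect the MCMC sample.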

nomodelsummary suppresses the detailed summary of the specified model. The model summary is reported by default.

noexpression suppresses the output of expressions from the model summary. Expressions (when specified) are reported by default.

nodots, dots, and dots(#) specify to suppress or display dots during simulation. dots(#) displays a dot every # iterations. During the adaptation period, a symbol a is displayed instead of a dot. If dots(..., every(#)) is specified, then an iteration number is displayed every #th iteration instead of a dot or a. dots(, every(#)) is equivalent to dots(1, every(#)). dots displays dots every 100 iterations and iteration numbers every 1,000 iterations; it is a synonym for dots(100, every(1000)). By default, no dots are displayed (nodots or dots(0)).
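As a sketch of the two-part form, again assuming the oxygen dataset from the Examples section, the command below would print a dot every 500 iterations and an iteration number every 2,500 iterations. The particular numbers are illustrative only.

. webuse oxygen
. set seed 14
. bayesmh change age group, likelihood(normal({var})) prior({change:}, flat) prior({var}, jeffreys) dots(500, every(2500))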

show(paramref) or noshow(paramref) specifies a list of model parameters to be included in the output or excluded from the output, respectively. By default, all model parameters (except random-effects parameters when reffects() is specified) are displayed. Do not confuse noshow() with exclude(), which excludes the specified parameters from the MCMC sample. When the noshow() option is specified, for computational efficiency, MCMC summaries of the specified parameters are not computed or stored in e(). paramref can include individual random-effects parameters.

showreffects and showreffects(reref) are used with option reffects() and specify that all or a list reref of random-effects parameters be included in the output in addition to other model parameters. By default, all random-effects parameters introduced by reffects() are excluded from the output as if you have specified the noshow() option. This option computes, displays, and stores in e() MCMC summaries for the first #_matsize-#_npar random-effects parameters, where #_matsize is the maximum number of variables as determined by matsize (see [R] matsize) and #_npar is the number of other model parameters displayed. If you want to obtain MCMC summaries and display other random-effects parameters, you can use the show() option or use bayesstats summary (see [BAYES] bayesstats summary).

notable suppresses the estimation table from the output. By default, a summary table is displayed containing all model parameters except those listed in the exclude() and noshow() options. Regression model parameters are grouped by equation names. The table includes six columns and reports the following statistics using the MCMC simulation results: posterior mean, posterior standard deviation, MCMC standard error or MCSE, posterior median, and credible intervals.

noheader suppresses the output header either at estimation or upon replay.

title(string) specifies an optional title for the command that is displayed above the table of the parameter estimates. The default title is specific to the specified likelihood model.

display_options: vsquish, noemptycells, baselevels, allbaselevels, nofvlabel, fvwrap(#), fvwrapon(style), and nolstretch; see [R] estimation options.

    +----------+
----+ Advanced +---------------------------------------------------------

search(search_options) searches for feasible initial values. search_options are on, repeat(#), and off.

search(on) is equivalent to search(repeat(500)). This is the default.

search(repeat(k)), k>0, specifies the number of random attempts to be made to find a feasible initial-value vector, or initial state. The default is repeat(500). An initial-value vector is feasible if it corresponds to a state with positive posterior probability. If feasible initial values are not found after k attempts, an error will be issued. repeat(0) (rarely used) specifies that no random attempts be made to find a feasible starting point. In this case, if the specified initial vector does not correspond to a feasible state, an error will be issued.

search(off) prevents the command from searching for feasible initial values. We do not recommend specifying this option.

corrlag(#) specifies the maximum autocorrelation lag used for calculating effective sample sizes. The default is min{500,mcmcsize()/2}. The total autocorrelation is computed as the sum of all lag-k autocorrelation values for k from 0 to either corrlag() or the index at which the autocorrelation becomes less than corrtol() if the latter is less than corrlag(). Options corrlag() and batch() may not be combined.

corrtol(#) specifies the autocorrelation tolerance used for calculating effective sample sizes. The default is corrtol(0.01). For a given model parameter, if the absolute value of the lag-k autocorrelation is less than corrtol(), then all autocorrelation lags beyond the kth lag are discarded. Options corrtol() and batch() may not be combined.

Remarks

Remarks are presented under the following headings:

Using bayesmh
Declaring model parameters
Referring to model parameters
Substitutable expressions

Using bayesmh

The bayesmh command for Bayesian analysis includes three functional components: setting up a posterior model, performing MCMC simulation, and summarizing and reporting results. The first component, the model-building step, requires some experience in the practice of Bayesian statistics and, as any modeling task, is probably the most demanding. You should specify a posterior model that is statistically correct and that represents the observed data. Another important aspect is the computational feasibility of the model in the context of the MH MCMC procedure implemented in bayesmh. The provided MH algorithm is adaptive and, to a degree, can accommodate various statistical models and data structures. However, careful model parametrization and well-specified initial values and MCMC sampling scheme are crucial for achieving a fast-converging Markov chain and consequently good results. Simulation of MCMC must be followed by a thorough investigation of the convergence of the MCMC algorithm. Once you are satisfied with the convergence of the simulated chains, you may proceed with posterior summaries of the results and their interpretation. Below we discuss the three major steps of using bayesmh and provide recommendations.

Declaring model parameters

Model parameters are typically declared, meaning first introduced, in the arguments of distributions specified in options likelihood() and prior(). We will refer to model parameters that are declared in the prior distributions (and not the likelihood distributions) as hyperparameters. Model parameters may also be declared within the parameter specification of the prior() option, but this is less common.

bayesmh distinguishes between two types of model parameters: scalar and matrix. All parameters must be specified in curly braces, { }. There are two ways of declaring a scalar parameter: {param} and {eqname:param}, where param and eqname are valid Stata names.

The specification of a matrix parameter is similar, but you must use the matrix suboption: {param, matrix} and {eqname:param, matrix}. The most common application of matrix model parameters is for specifying the variance-covariance matrix of a multivariate normal distribution.

All matrices are assumed to be symmetric, and only the elements in the lower triangle (including the diagonal) are reported in the output. Only a few multivariate prior distributions are available for matrix parameters: wishart(), iwishart(), and jeffreys(). In addition to being symmetric, these distributions require that the matrices be positive definite.

It is your responsibility to declare all parameters of your model, except regression coefficients in linear models. For a linear model, bayesmh automatically creates a regression coefficient with the name {depvar:indepvar} for each independent variable indepvar in the model and, if noconstant is not specified, an intercept parameter {depvar:_cons}. In the presence of factor variables, bayesmh will create a parameter {depvar:level} for each level indicator level and a parameter {depvar:inter} for each interaction indicator inter; see fvvarlists. (It is still your responsibility, however, to specify prior distributions for the regression parameters.)

For example,

. bayesmh y x, ...

will automatically have two regression parameters: {y:x} and {y:_cons}, whereas

. bayesmh y x, noconstant ...

will have only one: {y:x}.

For a univariate normal linear regression, we may want to additionally declare the scalar variance parameter by

. bayesmh y x, likelihood(normal({sig2})) ...

We can label the variance parameter, as follows:

. bayesmh y x, likelihood(normal({var:sig2})) ...

We can declare a hyperparameter for {sig2} using

. bayesmh y x, likelihood(normal({sig2})) prior({sig2}, igamma({df},2)) ...

where the hyperparameter {df} is declared in the inverse-gamma prior distribution for {sig2}.
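Because every declared parameter, including a hyperparameter, needs its own prior, a complete command might look like the following sketch. The variables y and x are hypothetical, and the normal(0,100) and uniform(1,10) priors are arbitrary choices made for illustration; they are not the elided part of the command above.

. bayesmh y x, likelihood(normal({sig2})) prior({y:}, normal(0,100)) prior({sig2}, igamma({df},2)) prior({df}, uniform(1,10))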

For a multivariate normal linear regression, in addition to the four regression parameters declared automatically by bayesmh ({y1:x}, {y1:_cons}, {y2:x}, and {y2:_cons}), we may also declare a parameter for the variance-covariance matrix:

. bayesmh y1 y2 = x, likelihood(mvnormal({Sigma, matrix})) ...

or abbreviate matrix to m for short:

. bayesmh y1 y2 = x, likelihood(mvnormal({Sigma, m})) ...

Referring to model parameters

After a model parameter is declared, we may need to refer to it in our further model specification. We will definitely need to refer to it when we specify its prior distribution. We may also need to use it as an argument in the prior distributions of other parameters or need to specify it in the block() option for blocking of model parameters; see Improving efficiency of the MH algorithm---blocking of parameters in [BAYES] bayesmh.

To refer to one parameter, we simply use its definition: {param}, {eqname:param}, {param, matrix}, or {eqname:param, matrix}. There are several ways in which you can refer to multiple parameters. You can refer to multiple model parameters in the parameter specification paramref of the prior(paramref, ...) option, of the block(paramref, ...) option, or of the initial(paramref #) option.

The most straightforward way to refer to multiple scalar model parameters is to simply list them individually, as follows:

{param1} {param2} ...

but there are shortcuts.

For example, the alternative to the above is

{param1 param2} ...

where we simply list the names of all parameters inside one set of curly braces.

If parameters have the same equation name, you can refer to all of the parameters with that equation name as follows. Suppose that we have three parameters with the same equation name eqname. Then the specification

{eqname:param1} {eqname:param2} {eqname:param3}

is the same as the specification

{eqname:}

or the specification

{eqname:param1 param2 param3}

The above specification is useful if we want to refer to a subset of parameters with the same equation name. For example, in the above, if we wanted to refer to only param1 and param2, we could type

{eqname:param1 param2}
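The shortcuts above can be combined in one model specification. The sketch below assumes hypothetical variables y, x1, and x2 and uses illustrative priors: it assigns one prior to the two slope coefficients, a separate prior to the intercept, and places the slopes in their own block.

. bayesmh y x1 x2, likelihood(normal({var})) prior({y:x1 x2}, normal(0,100)) prior({y:_cons}, normal(0,1000)) prior({var}, igamma(0.01,0.01)) block({y:x1 x2})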

If a factor variable is used in the specification of the regression function, you can use the same factor-variable specification within paramref to refer to the coefficients associated with the levels of that factor variable; see fvvarlists.

For example, factor variables are useful for constructing multilevel Bayesian models. Suppose that variable id defines the second level of hierarchy in a two-level random-effects model. We can fit a Bayesian random-intercept model as follows.

. bayesmh y x i.id, likelihood(normal({var})) prior({y:i.id}, normal(0,{tau})) ...

Here we used {y:i.id} in the prior specification to refer to all levels of id.

Similarly, we can add a random coefficient for a continuous covariate x by typing

. bayesmh y c.x##i.id, likelihood(normal({var})) prior({y:i.id}, normal(0,{tau1})) prior({y:c.x#i.id}, normal(0,{tau2})) ...

You can mix and match all of the specifications above in one parameter specification, paramref.

To refer to multiple matrix model parameters, you can use {paramlist, matrix} to refer to matrix parameters with names paramlist and {eqname:paramlist, matrix} to refer to matrix parameters with names in paramlist and with equation name eqname.

For example, the specification

{eqname:Sigma1,m} {eqname:Sigma2,m} {Sigma3,m} {Sigma4,m}

is the same as the specification

{eqname:Sigma1 Sigma2,m} {Sigma3 Sigma4,m}

You cannot refer to both scalar and matrix parameters in one paramref specification.

For referring to model parameters in postestimation commands, see Different ways of specifying model parameters in [BAYES] bayesian postestimation.

Substitutable expressions

You may use substitutable expressions in bayesmh to define nonlinear expressions subexpr, arguments of outcome distributions in option likelihood(), observation-level log likelihood in option llf(), arguments of prior distributions in option prior(), and generic prior distributions in prior()'s suboptions density() and logdensity(). Substitutable expressions are just like any other mathematical expression in Stata, except that they may include model parameters.

To specify a substitutable expression in your bayesmh model, you must comply with the following rules:

1. Model parameters are bound in braces: {mu}, {var:sigma2}, {Sigma, matrix}, and {Cov:Sigma, matrix}.

2. Linear combinations can be specified using the notation

{eqname:varlist[, xb noconstant]}

For example, {lc:mpg price weight} is equivalent to

{lc:mpg}*mpg + {lc:price}*price + {lc:weight}*weight + {lc:_cons}

The xb option is used to distinguish between the linear combination that contains one variable and a free parameter that has the same name as the variable and the same group name as the linear combination. For example, {lc:weight, xb} is equivalent to {lc:_cons} + {lc:weight}*weight, whereas {lc:weight} refers to either a free parameter weight with a group name lc or the coefficient of the weight variable, if {lc:} has been previously defined in the expression as a linear combination that involves variable weight. Thus the xb option indicates that the specification is a linear combination rather than a single parameter to be estimated.

When you define a linear combination, a constant term is included by default. The noconstant option suppresses the constant.

See Linear combinations in [ME] menl for details about specifying linear combinations.

3. Initial values are given by including an equal sign and the initial value inside the braces, for example, {b1=1.267}, {gamma=3}, etc. If you do not specify an initial value, that parameter is initialized to one for positive scalar parameters and to zero for other scalar parameters, or it is initialized to its MLE, if available. The initial() option overrides initial values provided in substitutable expressions. Initial values for matrices must be specified in the initial() option. By default, matrix parameters are initialized with identity matrices.
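As a sketch of these rules in a nonlinear model, the command below assumes hypothetical variables y and x and an exponential-decay mean function chosen only for illustration. Parameters are bound in braces, initial values are supplied inside the braces, and the three parameters are referred to together in one prior() specification.

. bayesmh y = ({b0=1} + {b1=0.5}*exp(-{b2=0.1}*x)), likelihood(normal({var})) prior({b0 b1 b2}, normal(0,100)) prior({var}, igamma(0.01,0.01))

If initial() were also specified for any of these parameters, its values would take precedence over the ones given inside the braces.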

Examples

---------------------------------------------------------------------------
Setup
. webuse oxygen

Bayesian normal linear regression with noninformative priors
. set seed 14
. bayesmh change age group, likelihood(normal({var})) prior({change:}, flat) prior({var}, jeffreys)

Bayesian normal linear regression with normal and inverse-gamma priors
. set seed 14
. bayesmh change age group, likelihood(normal({var})) prior({change:}, normal(0, {var})) prior({var}, igamma(2.5, 2.5))

Bayesian normal linear regression with multivariate Zellner’s g-prior
. set seed 14
. bayesmh change age group, likelihood(normal({var})) prior({change:}, zellnersg0(3,12,{var})) prior({var}, igamma(0.5, 4))

Update parameter {var} separately from other model coefficients
. set seed 14
. bayesmh change age group, likelihood(normal({var})) prior({change:}, zellnersg0(3,12,{var})) prior({var}, igamma(0.5, 4)) block({var})

Use Gibbs sampling for parameter {var} and display the summary about blocks
. set seed 14
. bayesmh change age group, likelihood(normal({var})) prior({change:}, normal(0, 100)) prior({var}, igamma(0.5, 4)) block({var}, gibbs) blocksummary

Bayesian logistic regression model with a noninformative prior
. webuse hearthungary
. set seed 14
. bayesmh disease restecg isfbs age male, likelihood(logit) prior({disease:}, normal(0,1000))

Bayesian ordered probit model including hyperparameter {lambda}
. webuse fullauto
. replace length = length/10
. set seed 14
. bayesmh rep77 foreign length mpg, likelihood(oprobit) prior({rep77: foreign length mpg}, normal(0,1)) prior({rep77:_cut1 _cut2 _cut3 _cut4}, exponential({lambda=30})) prior({lambda}, uniform(10,40)) block(lambda)

---------------------------------------------------------------------------
Setup
. sysuse auto, clear
. replace weight = weight/1000
. replace length = length/100
. replace mpg = mpg/10

Bayesian multivariate normal model including matrix parameter {Sigma} for the covariance matrix
. set seed 14
. bayesmh (mpg) (weight) (length), likelihood(mvnormal({Sigma,m})) prior({mpg:_cons} {weight:_cons} {length:_cons}, normal(0,100)) prior({Sigma,m}, iwishart(3,100,I(3))) block({mpg:_cons} {weight:_cons} {length:_cons}) block({Sigma,m}) dots

Request additional burn-in and more frequent adaptation
. set seed 14
. bayesmh (mpg) (weight) (length), likelihood(mvnormal({Sigma,m})) prior({mpg:_cons} {weight:_cons} {length:_cons}, normal(0,100)) prior({Sigma,m}, iwishart(3,100,I(3))) block({mpg:_cons} {weight:_cons} {length:_cons}) block({Sigma,m}) dots burnin(5000) adaptation(every(50))

Request Gibbs sampling for covariance matrix {Sigma}
. set seed 14
. bayesmh (mpg) (weight) (length), likelihood(mvnormal({Sigma,m})) prior({mpg:_cons} {weight:_cons} {length:_cons}, normal(0,100)) prior({Sigma,m}, iwishart(3,100,I(3))) block({mpg:_cons} {weight:_cons} {length:_cons}) block({Sigma,m}, gibbs) dots

---------------------------------------------------------------------------
Setup
. webuse pig, clear

Bayesian linear random-intercept model
. set seed 14
. fvset base none id
. bayesmh weight week i.id, likelihood(normal({var_0})) noconstant prior({weight:i.id}, normal({weight:_cons},{var_id})) prior({weight:_cons}, normal(0, 100)) prior({weight:week}, normal(0, 100)) prior({var_0}, igamma(0.001, 0.001)) prior({var_id}, igamma(0.001, 0.001)) mcmcsize(5000) dots

Bayesian linear random-intercept model using the reffects() option
. set seed 14
. bayesmh weight week, reffects(id) likelihood(normal({var_0})) noconstant prior({weight:i.id}, normal({weight:_cons},{var_id})) prior({weight:_cons}, normal(0, 100)) prior({weight:week}, normal(0, 100)) prior({var_0}, igamma(0.001, 0.001)) prior({var_id}, igamma(0.001, 0.001)) mcmcsize(5000) dots

---------------------------------------------------------------------------
Setup
. webuse coal

Analysis of a change point problem with target MCMC sample size of 20,000
. set seed 14
. bayesmh count, likelihood(dpoisson({mu1}*sign(year<{cp})+{mu2}*sign(year>={cp}))) prior({mu1} {mu2}, flat) prior({cp}, uniform(1851,1962)) initial({mu1} 1 {mu2} 1 {cp} 1906) mcmcsize(20000)

---------------------------------------------------------------------------

Video examples

Introduction to Bayesian statistics, part 1: The basic concepts

Introduction to Bayesian statistics, part 2: MCMC and the Metropolis-Hastings algorithm

Stored results

bayesmh stores the following in e():

Scalars
e(N)                  number of observations
e(k)                  number of parameters
e(k_sc)               number of scalar parameters
e(k_mat)              number of matrix parameters
e(n_eq)               number of equations
e(mcmcsize)           MCMC sample size
e(burnin)             number of burn-in iterations
e(mcmciter)           total number of MCMC iterations
e(thinning)           thinning interval
e(arate)              overall AR
e(eff_min)            minimum efficiency
e(eff_avg)            average efficiency
e(eff_max)            maximum efficiency
e(clevel)             credible interval level
e(hpd)                1 if hpd is specified, 0 otherwise
e(batch)              batch length for batch-mean calculations
e(corrlag)            maximum autocorrelation lag
e(corrtol)            autocorrelation tolerance
e(dic)                deviance information criterion
e(lml_lm)             log marginal-likelihood using Laplace-Metropolis method
e(scale)              initial multiplier for scale factor; scale()
e(block#_gibbs)       1 if Gibbs sampling is used in #th block, 0 otherwise
e(block#_reffects)    1 if the parameters in #th block are random effects, 0 otherwise
e(block#_scale)       #th block initial multiplier for scale factor
e(block#_tarate)      #th block target adaptation rate
e(block#_arate_last)  #th block AR from the last adaptive iteration
e(block#_tolerance)   #th block adaptation tolerance
e(adapt_every)        adaptation iterations; adaptation(every())
e(adapt_maxiter)      maximum number of adaptive iterations; adaptation(maxiter())
e(adapt_miniter)      minimum number of adaptive iterations; adaptation(miniter())
e(adapt_alpha)        adaptation parameter; adaptation(alpha())
e(adapt_beta)         adaptation parameter; adaptation(beta())
e(adapt_gamma)        adaptation parameter; adaptation(gamma())
e(adapt_tolerance)    adaptation tolerance; adaptation(tolerance())
e(repeat)             number of attempts used to find feasible initial values

Macros
e(cmd)            bayesmh
e(cmdline)        command as typed
e(method)         sampling method
e(depvars)        names of dependent variables
e(eqnames)        names of equations
e(likelihood)     likelihood distribution (one equation)
e(likelihood#)    likelihood distribution for #th equation
e(prior)          prior distribution
e(prior#)         prior distribution, if more than one prior() is specified
e(priorparams)    parameter specification in prior()
e(priorparams#)   parameter specification from #th prior(), if more than one prior() is specified
e(parnames)       names of model parameters except exclude()
e(postvars)       variable names corresponding to model parameters in e(parnames)
e(subexpr)        substitutable expression
e(subexpr#)       substitutable expression, if more than one
e(wtype)          weight type (one equation)
e(wtype#)         weight type for #th equation
e(wexp)           weight expression (one equation)
e(wexp#)          weight expression for #th equation
e(block#_names)   parameter names from #th block
e(exclude)        names of excluded parameters
e(filename)       name of the file with simulation results
e(scparams)       scalar model parameters
e(matparams)      matrix model parameters
e(pareqmap)       model parameters in display order
e(title)          title in estimation output
e(rngstate)       random-number state at the time of simulation
e(search)         on, repeat(), or off

Matrices
e(mean)     posterior means
e(sd)       posterior standard deviations
e(mcse)     MCSE
e(median)   posterior medians
e(cri)      credible intervals
e(Cov)      variance-covariance matrix of parameters
e(ess)      effective sample sizes
e(init)     initial values vector

Functions
e(sample)   mark estimation sample
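A brief sketch of inspecting these results, assuming a bayesmh model has just been fit in the current session. The items shown are a small selection from the lists above.

. ereturn list
. display e(arate)
. display e(mcmcsize)
. display "`e(parnames)'"
. matrix list e(mean)
. count if e(sample)

Scalars and macros can be displayed directly, matrices are shown with matrix list, and e(sample) can be used in an if qualifier to restrict a command to the estimation sample.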


© Copyright 1996–2018 StataCorp LLC