Home  /  Products  /  Features  /  Bayesian analysis: Multiple Markov chains

### <-  See Stata's other features

#### Highlights

• nchains() option for simulating multiple chains with bayes: and bayesmh

• Use default or specify your own initial values for each chain

• Posterior summaries, graphical diagnostics, and more using multiple chains

• Gelman–Rubin convergence diagnostics

Markov chain Monte Carlo (MCMC) is used for Bayesian inference. Has the MCMC converged? Has it fully explored the target posterior distribution? Or do you need more simulations? You need to answer these questions for proper inference. Stata's nchains() can help you with that.

Type

. bayes, nchains(4): regress y x1 x2


and four chains will be produced, in this case for Bayesian regression of y on x1 and x2. The chains will be compared and the maximum Gelman–Rubin convergence diagnostic reported. And the chains will be combined to produce a more accurate final result. The intent is that you check the reported diagnostic before interpreting the results.

nchains() may be used with the bayes: or bayesmh command.

For convergence diagnostic purposes, it is important that different chains have dispersed initial values for the model parameters. Default initial values are provided, but they may not be sufficient for all cases. You can use initall() and init#() to specify your own initial values for all chains or for specific chains. See Convergence diagnostics using multiple chains for details.

All existing Bayesian postestimation commands such as bayesstats, bayestest, and bayesgraph provide support for multiple chains via the chains() and sepchains options.

Also see the bayesstats grubin command, which computes Gelman–Rubin convergence diagnostics for all model parameters, and How to run multiple chains in parallel?

### Let's see it work

Consider the diabetes data (Efron et al. 2004) on 442 diabetes patients that record a measure of disease progression as an outcome y. To demonstrate multiple chains, we consider a small subset of variables of interest: age, sex, body mass index, and mean arterial pressure.

Multiple chains serve multiple purposes. More commonly, they are used to explore MCMC convergence. But they are also useful for obtaining more precise estimates of parameters. And you can run them in parallel to save time.

#### Using multiple chains to explore convergence visually

Let's fit a Bayesian linear regression and produce three multiple chains.

. bayes, nchains(3) rseed(16): regress y age sex bmi map

Chain 1
Burn-in ...
Simulation ...

Chain 2
Burn-in ...
Simulation ...

Chain 3
Burn-in ...
Simulation ...

Model summary

Likelihood:
y ~ regress(xb_y,{sigma2})

Priors:
{y:age sex bmi map _cons} ~ normal(0,10000)                              (1)
{sigma2} ~ igamma(.01,.01)

(1) Parameters are elements of the linear form xb_y.

Bayesian linear regression                    Number of chains    =          3
Random-walk Metropolis-Hastings sampling      Per MCMC chain:
Iterations      =     12,500
Burn-in         =      2,500
Sample size     =     10,000
Number of obs       =        442
Avg acceptance rate =      .3274
Avg efficiency: min =     .04375
avg =     .07228
max =      .1859
Avg log marginal-likelihood =  -2478.936      Max Gelman-Rubin Rc =      1.002

Equal-tailed
Mean   Std. dev.     MCSE     Median  [95% cred. interval]

y
age    64.63216    53.6199   1.48006   64.57528  -40.40117   170.1121
sex   -56.67161   52.83794   1.25481  -55.63422  -158.4018   45.66125
bmi    591.5728    55.3663   1.49183   592.1313   483.4726   699.7989
map    345.2107   56.15044   1.45502   344.3014    233.006   455.7745
_cons     152.063   2.876962   .074766   152.1028   146.3381   157.6296

sigma2    3722.503   259.0622   3.46912   3709.148    3245.06   4254.987

Note: Default priors are used for model parameters.
Note: Default initial values are used for multiple chains.


We demonstrate the use of multiple chains to explore MCMC convergence in detail in Gelman–Rubin convergence diagnostic. Here we will just point out that the maximum Gelman–Rubin diagnostic, reported as Max Gelman-Rubin Rc in the header, is 1.002, which is less than 1.1 and thus does not indicate convergence problems for our model.

We can also explore convergence visually by looking at a trace plot, for instance, for the bmi coefficient.

. bayesgraph trace {y:bmi}


With multiple chains, bayesgraph automatically plots multiple chains on one graph. The trace plot does not indicate any convergence problems. All chains look similar and oscillate around similar point estimates.

We can explore more diagnostics.

. bayesgraph diagnostics {y:bmi}


The autocorrelation and distribution plots look similar for all chains. The autocorrelation dies off fairly quickly in all three chains.

In this demonstration, we are satisfied with our visual exploration of MCMC convergence. In actual analysis, we would want to perform the above steps for all model parameters and explore the convergence more formally.

#### Posterior summaries using multiple chains

After verifying MCMC convergence, as in Using multiple chains to explore convergence visually and Using multiple chains to explore convergence formally, we can perform inference.

In the presence of multiple chains, posterior summaries are computed by using all chains, three in our example:

. bayesstats summary

Posterior summary statistics                      Number of chains =         3
MCMC sample size =    30,000

Equal-tailed
Mean   Std. dev.     MCSE     Median  [95% cred. interval]

y
age    64.63216    53.6199   1.48006   64.57528  -40.40117   170.1121
sex   -56.67161   52.83794   1.25481  -55.63422  -158.4018   45.66125
bmi    591.5728    55.3663   1.49183   592.1313   483.4726   699.7989
map    345.2107   56.15044   1.45502   344.3014    233.006   455.7745
_cons     152.063   2.876962   .074766   152.1028   146.3381   157.6296

sigma2    3722.503   259.0622   3.46912   3709.148    3245.06   4254.987



Using all chains provides more precise estimates of model parameters. For instance, we can compare the results above with the results from only the first chain:

. bayesstats summary, chains(1)

Posterior summary statistics                      Number of chains =         1
Chain 1                                           MCMC sample size =    10,000

Equal-tailed
Mean   Std. dev.     MCSE     Median  [95% cred. interval]

y
age    65.38379   54.08876   2.18726   65.78475  -42.80366   169.0617
sex   -57.99493   54.04443   2.46374  -57.67259  -158.3685   47.19167
bmi    590.8162   55.57828   2.90659   590.5812   483.4726   700.9136
map    346.9589   56.02036   2.80642   344.5968   238.6808   460.3592
_cons    152.1605   2.880091   .136456   152.2867   146.5691   157.6759

sigma2    3726.441   262.7137   6.34486   3720.178   3249.068   4262.679



The posterior mean and standard deviations are similar, but the MCSEs are lower for the results using all three chains because there are more MCMC estimates and they are less correlated.

We can even compare the results from all chains.

. bayesstats summary, sepchains

Posterior summary statistics

Chain 1                                           MCMC sample size =    10,000

Equal-tailed
Mean   Std. dev.     MCSE     Median  [95% cred. interval]

y
age    65.38379   54.08876   2.18726   65.78475  -42.80366   169.0617
sex   -57.99493   54.04443   2.46374  -57.67259  -158.3685   47.19167
bmi    590.8162   55.57828   2.90659   590.5812   483.4726   700.9136
map    346.9589   56.02036   2.80642   344.5968   238.6808   460.3592
_cons    152.1605   2.880091   .136456   152.2867   146.5691   157.6759

sigma2    3726.441   262.7137   6.34486   3720.178   3249.068   4262.679

Chain 2                                           MCMC sample size =    10,000

Equal-tailed
Mean   Std. dev.     MCSE     Median  [95% cred. interval]

y
age    61.57452   52.64732   4.16571   60.93262  -41.98606   165.0094
sex    -56.0638   53.11474   1.99601   -53.1213  -155.3794   45.02055
bmi    592.0209   55.81173   2.41218   591.3277   482.1642   699.0798
map    345.6178    57.0509     2.537    344.935    235.805   460.1346
_cons    152.1002    2.82684    .13415   152.0675   146.5058   157.6649

sigma2    3720.897    257.873   5.96029   3706.499   3240.245   4253.801

Chain 3                                           MCMC sample size =    10,000

Equal-tailed
Mean   Std. dev.     MCSE     Median  [95% cred. interval]

y
age    66.93817   53.90694   2.31714   67.29965  -35.94106   174.1906
sex   -55.95611   51.28798   2.12267  -54.78232  -159.7941   45.54302
bmi    591.8815   54.69921   2.50606    593.381   484.8148   698.1792
map    343.0555   55.26911   2.28489   343.8618   224.2158   449.4169
_cons    151.9282   2.916109   .119936    151.953   146.1003   157.5059

sigma2    3720.172   256.5298   5.75014   3703.189   3244.123   4247.216



Multiple chains may be particularly useful for parameters with low sampling efficiencies.

Just like bayesstats summary, all other postestimation commands recognize the presence of multiple chains. For instance, suppose we would like to test whether the coefficient for bmi is larger than 600. We can use bayestest interval to compute the posterior probability that {y:bmi}>600.

. bayestest interval {y:bmi}, lower(600)

Interval tests     Number of chains =         3
MCMC sample size =    30,000

prob1 : {y:bmi} > 600

Mean    Std. dev.      MCSE

prob1       .4436     0.49684   .0124317



The estimated posterior probability is about 44%.

### Reference

Efron, B., T. Hastie, I. Johnstone, and R. Tibshirani. 2004. Least angle regression. The Annals of Statistics 32: 407–499.

### Tell me more

Learn about Bayesian lasso modeling.

Read more about Multiple chains and Bayesian analysis in the Stata Bayesian Analysis Reference Manual.