»  Home »  Products »  Features »  Bayesian multilevel models

## Bayesian multilevel models

### What's this about?

Multilevel models are regression models that incorporate group-specific effects. Groups may represent different levels of hierarchy such as hospitals, doctors nested within hospitals, and patients nested within doctors nested within hospitals. Group-specific effects are assumed to vary randomly across groups according to some a priori distribution, commonly a normal distribution. This assumption makes multilevel models natural candidates for Bayesian analysis. Bayesian multilevel models additionally assume that other model parameters such as regression coefficients and variance components—variances of group-specific effects—are also random.

Why use Bayesian multilevel models? In addition to standard reasons for Bayesian analysis, Bayesian multilevel modeling is often used when the number of groups is small or in the presence of many hierarchical levels. Bayesian information criteria such as deviance information criterion (DIC) are also popular for comparing multilevel models. When the comparison of groups is of main interest, Bayesian multilevel modeling can provide entire distributions of group-specific effects.

You can fit Bayesian multilevel models in Stata and you can do this easily—just prefix your multilevel command with bayes:

. bayes: mixed y x1 x2 || id:


Of course, when we say "easily", we refer to the model specification and not the model formulation. Just like any other modeling task, Bayesian multilevel modeling requires careful consideration.

Continuous, censored, binary, ordinal, count, GLM, and survival outcomes are supported; see the full list of supported multilevel commands. All multilevel features such as multiple levels of hierarchy, nested and crossed random effects, random intercepts and coefficients, and random-effects covariance structures are available. All Bayesian features as provided by the [BAYES] bayesmh command are supported when you use the bayes prefix with multilevel commands; read about general features of the bayes prefix.

### Highlights

• Outcomes: continuous, censored, binary, ordinal, count, GLM, survival
• Random intercepts and coefficients
• Nested and crossed effects
• Multiple levels of hierarchy
• Random-effects covariance structures
• Full Bayesian-features support

### Let's see it work

Now, let's look at several examples.

### Random intercepts

Consider data on math scores of pupils in the third and fifth years from different schools in Inner London (Mortimore et al. 1988). We want to investigate a school effect on math scores.

We can use Stata's mixed command to fit a two-level linear model of five-year math scores (math5) on three-year math scores (math3) with random intercepts for schools.

. mixed math5 math3 || school:

Performing EM optimization ...

Iteration 0:   log likelihood = -2767.8923
Iteration 1:   log likelihood = -2767.8923

Computing standard errors ...

Mixed-effects ML regression                     Number of obs     =        887
Group variable: school                          Number of groups  =         48
Obs per group:
min =          5
avg =       18.5
max =         62
Wald chi2(1)      =     347.92
Log likelihood = -2767.8923                     Prob > chi2       =     0.0000

math5   Coefficient  Std. err.      z    P>|z|     [95% conf. interval]

math3     .6088066   .0326392    18.65   0.000     .5448349    .6727783
_cons     30.36495   .3491544    86.97   0.000     29.68062    31.04928

Random-effects parameters      Estimate   Std. err.     [95% conf. interval]

school: Identity
var(_cons)     4.026853   1.189895      2.256545    7.186004

var(Residual)     28.12721    1.37289       25.5611    30.95094
LR test vs. linear model: chibar2(01) = 56.38         Prob >= chibar2 = 0.0000

The likelihood-ratio test at the bottom and the estimate of the school variance component suggest statistically significant variability between schools in the math5 scores after adjusting for the math3 scores.

To fit the corresponding Bayesian model, you can simply prefix the above mixed command with bayes:.

. bayes: mixed math5 math3 || school:


But here, we will first use bayes's melabel option to obtain output similar to that of mixed for easier comparison of the results.

. bayes, melabel: mixed math5 math3 || school:

Bayesian multilevel regression                   MCMC iterations  =     12,500
Metropolis–Hastings and Gibbs sampling           Burn-in          =      2,500
MCMC sample size =     10,000
Group variable: school                           Number of groups =         48
Obs per group:
min =          5
avg =       18.5
max =         62
Number of obs    =        887
Acceptance rate  =      .8091
Efficiency:  min =     .03366
avg =      .3331
Log marginal-likelihood                                       max =      .6298

Equal-tailed
Mean   Std. dev.     MCSE     Median  [95% cred. interval]

math5
math3    .6087689   .0326552   .000436   .6087444   .5450837   .6729982
_cons    30.39202   .3597873    .01961   30.38687   29.67802   31.10252

school
var(_cons)   4.272626   1.299061   .039697   4.122282   2.247659   7.220809

var(Residual)   28.23014    1.37812   .017365   28.18347   25.63394   31.04375

Note: Default priors are used for model parameters.

The reported estimates of posterior means and posterior standard deviations for model parameters are similar to the corresponding maximum-likelihood estimates and standard errors reported by mixed.

Unlike mixed, which provided one estimate for each model parameter, bayes: mixed provided, for each parameter, a sample of 10,000 Markov chain Monte Carlo (MCMC) estimates from the simulated posterior distribution of the parameters. The reported posterior means and posterior distributions are the corresponding summaries of the marginal posterior distributions of the parameters.

Let's now see the output from bayes: mixed without the melabel option. The output is lengthy, so we will describe it in pieces.

. bayes

Multilevel structure

school
{U0}: random intercepts

Model summary

Likelihood:
math5 ~ normal(xb_math5,{e.math5:sigma2})

Priors:
{math5:math3 _cons} ~ normal(0,10000)                                    (1)
{U0} ~ normal(0,{U0:sigma2})                              (1)
{e.math5:sigma2} ~ igamma(.01,.01)

Hyperprior:
{U0:sigma2} ~ igamma(.01,.01)
(1) Parameters are elements of the linear form xb_math5.

Bayesian multilevel regression                   MCMC iterations  =     12,500
Metropolis–Hastings and Gibbs sampling           Burn-in          =      2,500
MCMC sample size =     10,000
Group variable: school                           Number of groups =         48
Obs per group:
min =          5
avg =       18.5
max =         62
Number of obs    =        887
Acceptance rate  =      .8091
Efficiency:  min =     .03366
avg =      .3331
Log marginal-likelihood                                       max =      .6298


The header now includes additional information about the fitted Bayesian model. We can see, for example, that parameter {U0} represents random intercepts in the model, that regression coefficients {math5:math3} and {math5:_cons} are assigned default normal priors with zero means and variances of 10,000, and that the variance component for schools, {U0:sigma2}, is assigned the default inverse-gamma prior with 0.01 for both the shape and scale parameters.

In the output table, the results are the same, but the parameter labels are different.



Equal-tailed
Mean   Std. dev.     MCSE     Median  [95% cred. interval]

math5
math3    .6087689   .0326552   .000436   .6087444   .5450837   .6729982
_cons    30.39202   .3597873    .01961   30.38687   29.67802   31.10252

school
U0:sigma2    4.272626   1.299061   .039697   4.122282   2.247659   7.220809

e.math5
sigma2    28.23014    1.37812   .017365   28.18347   25.63394   31.04375

Note: Default priors are used for model parameters.

By default, bayes: mixed displays results using parameter names as you would use when referring to these parameters in bayes's options or during postestimation. For example, you would use {U0:sigma2} to refer to the variance component for schools and {e.math5:sigma2} to refer to the error variance.

There is still one part of the output missing—the estimates of random intercepts {U0}. In a Bayesian multilevel model, random effects are model parameters just like regression coefficients and variance components. bayes does not report them by default because there are often too many of them. But you can display them during or after estimation. Here we replay the estimation, adding the showreffects() option to display the estimates of the first 12 random intercepts.

. bayes, showreffects({U0[1/12]})

Multilevel structure

school
{U0}: random intercepts

Model summary

Likelihood:
math5 ~ normal(xb_math5,{e.math5:sigma2})

Priors:
{math5:math3 _cons} ~ normal(0,10000)                                    (1)
{U0} ~ normal(0,{U0:sigma2})                              (1)
{e.math5:sigma2} ~ igamma(.01,.01)

Hyperprior:
{U0:sigma2} ~ igamma(.01,.01)
(1) Parameters are elements of the linear form xb_math5.

Bayesian multilevel regression                   MCMC iterations  =     12,500
Metropolis–Hastings and Gibbs sampling           Burn-in          =      2,500
MCMC sample size =     10,000
Group variable: school                           Number of groups =         48
Obs per group:
min =          5
avg =       18.5
max =         62
Number of obs    =        887
Acceptance rate  =      .8091
Efficiency:  min =     .03366
avg =      .1749
Log marginal-likelihood                                       max =      .6298

Equal-tailed
Mean   Std. dev.     MCSE     Median  [95% cred. interval]

math5
math3    .6087689   .0326552   .000436   .6087444   .5450837   .6729982
_cons    30.39202   .3597873    .01961   30.38687   29.67802   31.10252

U0[school]
1   -2.685824   .9776969   .031227  -2.672364  -4.633162  -.7837494
2     .015465   1.290535    .03201   .0041493  -2.560203   2.556316
3    1.049006   1.401383   .033731   1.021202  -1.534088    3.84523
4   -2.123055   .9921679   .028859  -2.144939  -4.069283  -.1507593
5   -.1504003   .9650027   .033881  -.1468966  -2.093015   1.721503
6    .5833945   1.192379   .032408   .5918357  -1.660335   3.049718
7    1.490231   1.332917   .033846   1.481793  -1.095757   4.272903
8    .4198105   .9783772   .031891   .4579817  -1.496317   2.403908
9   -1.996105    1.02632   .035372  -2.001467  -4.037044  -.0296276
10    .6736806   1.249238   .031114    .660939   -1.70319   3.179273
11   -.5650109   .9926453   .031783  -.5839293  -2.646413   1.300388
12   -.3620733   1.090265   .033474  -.3203626  -2.550097   1.717532

school
U0:sigma2    4.272626   1.299061   .039697   4.122282   2.247659   7.220809

e.math5
sigma2    28.23014    1.37812   .017365   28.18347   25.63394   31.04375

Note: Default priors are used for model parameters.

We could have used showreffects to display all 48.

To compare schools, we can plot posterior distributions (our prior normal distributions updated based on the observed data) of random intercepts. We plot histograms for the same first 12 random intercepts.

. bayesgraph histogram {U0[1/12]}, byparm


We save the MCMC results and store the estimation results from our Bayesian random-intercept model for later model comparison.

. bayes, saving(ri_mcmc)

. estimates store ri


### Custom priors

Our initial example used the default priors for model parameters. Just to show you how easy it is to specify custom priors, we specify a uniform on (-50,50) prior for the regression coefficients:

. bayes, prior({math5:}, uniform(-50,50)): mixed math5 math3 || school:
note: Gibbs sampling is used for variance components.

Burn-in 2500 aaaaaaaaa1000aaaaaaaaa2000aaaaa done
Simulation 10000 .........1000.........2000.........3000.........4000.........5
> 000.........6000.........7000.........8000.........9000.........10000 done

Multilevel structure

school
{U0}: random intercepts

Model summary

Likelihood:
math5 ~ normal(xb_math5,{e.math5:sigma2})

Priors:
{math5:math3 _cons} ~ uniform(-50,50)                                    (1)
{U0} ~ normal(0,{U0:sigma2})                              (1)
{e.math5:sigma2} ~ igamma(.01,.01)

Hyperprior:
{U0:sigma2} ~ igamma(.01,.01)
(1) Parameters are elements of the linear form xb_math5.

Bayesian multilevel regression                   MCMC iterations  =     12,500
Metropolis–Hastings and Gibbs sampling           Burn-in          =      2,500
MCMC sample size =     10,000
Group variable: school                           Number of groups =         48
Obs per group:
min =          5
avg =       18.5
max =         62
Number of obs    =        887
Acceptance rate  =      .5931
Efficiency:  min =     .02217
avg =      .2153
Log marginal-likelihood                                       max =      .6877

Equal-tailed
Mean   Std. dev.     MCSE     Median  [95% cred. interval]

math5
math3    .6094181   .0319517   .001432   .6085484   .5460873   .6732493
_cons    30.36818   .3290651   .022103   30.38259   29.73806    31.0131

school
U0:sigma2    4.261459   1.282453   .040219   4.084322   2.238583   7.218895

e.math5
sigma2    28.24094   1.374732   .016577   28.20275   25.68069   31.01401

Note: Default priors are used for some model parameters.

{math5:} in the above is a shortcut for referring to all regression coefficients associated with the dependent variable math5: {math5:math3} and {math5:_cons}.

Similarly, we can use different priors for each regression coefficient, but we omit the output for brevity.

. bayes, prior({math5:_cons}, uniform(-50,50))
> prior({math5:math3}, uniform(-5,5)):
> mixed math5 math3 || school:


### Random coefficients

Let's extend our simple random-intercept model to include a random coefficient.

Following mixed's specification, we include math3 in the random-effects equation for the school level. We also store our Bayesian estimation results for later comparison. This time, we save MCMC results during estimation.

. bayes, saving(rc_mcmc): mixed math5 math3 || school: math3

. estimates store rc


Here is abbreviated output from bayes: mixed, including a random coefficient for math3.


Multilevel structure

school
{U0}: random intercepts
{U1}: random coefficients for math3

Model summary

Likelihood:
math5 ~ normal(xb_math5,{e.math5:sigma2})

Priors:
{math5:math3 _cons} ~ normal(0,10000)                                    (1)
{U0} ~ normal(0,{U0:sigma2})                              (1)
{U1} ~ normal(0,{U1:sigma2})                              (1)
{e.math5:sigma2} ~ igamma(.01,.01)

Hyperprior:
{U0:sigma2} ~ igamma(.01,.01)
{U1:sigma2} ~ igamma(.01,.01)
(1) Parameters are elements of the linear form xb_math5.

some output omitted

Equal-tailed
Mean   Std. dev.     MCSE     Median  [95% cred. interval]

math5
math3    .6143538   .0454835   .001655   .6137192   .5257402   .7036098
_cons    30.38813   .3577296   .019669    30.3826   29.71581   31.10304

school
U0:sigma2    4.551927   1.368582   .041578   4.361247   2.420075   7.722063
U1:sigma2    .0398006   .0194373   .001271   .0363514   .0131232   .0881936

e.math5
sigma2    27.19758   1.354024   .021967   27.15869   24.71813   30.05862

Note: Default priors are used for model parameters.

mixed assumes independence between random intercepts and coefficients. bayes: mixed does too, to be consistent. We can relax this assumption by specifying an unstructured variance–covariance as follows. We save the MCMC results and store the estimation results from this model as well.

. bayes, saving(rcun_mcmc): mixed math5 math3 || school: math3,
> covariance(unstructured)

. estimates store rcun


Here is abbreviated output from bayes: mixed specifying the unstructured covariance.


Multilevel structure

school
{U0}: random intercepts
{U1}: random coefficients for math3

Model summary

Likelihood:
math5 ~ normal(xb_math5,{e.math5:sigma2})

Priors:
{math5:math3 _cons} ~ normal(0,10000)                                    (1)
{U0 U1} ~ mvnormal(2,{U:Sigma,m})                            (1)
{e.math5:sigma2} ~ igamma(.01,.01)

Hyperprior:
{U:Sigma,m} ~ iwishart(2,3,I(2))
(1) Parameters are elements of the linear form xb_math5.

some output omitted

Equal-tailed
Mean   Std. dev.     MCSE     Median  [95% cred. interval]

math5
math3    .6234197   .0570746   .002699   .6228624   .5144913   .7365849
_cons    30.34691   .3658515   .021356   30.34399   29.62991   31.07312

school
U:Sigma_1_1    4.527905   1.363492   .046275   4.345457   2.391319   7.765521
U:Sigma_2_1    -.322247   .1510543   .004913  -.3055407  -.6683891  -.0679181
U:Sigma_2_2    .0983104   .0280508   .000728   .0941222   .0556011   .1649121

e.math5
sigma2     26.8091    1.34032   .018382   26.76549   24.27881   29.53601

Note: Default priors are used for model parameters.

### Model comparison using DIC

We now use DIC to compare the three models: random-intercept model, random-coefficient model with independent covariance structure, and random-coefficient model with unstructured covariance structure.

. bayesstats ic ri rc rcun, diconly

Deviance information criterion

DIC

ri    5514.514
rc    5500.837
rcun    5494.349



DIC is the smallest for the random-coefficient model with an unstructured random-effects covariance, so this model is preferable.

### Custom priors for matrices

To demonstrate that you can specify custom priors for not only scalar parameters but also matrices, we specify a custom inverse-Wishart prior for the random-effects covariance matrix {Sigma,m}, which is short for {Sigma,matrix}.

. matrix S = (5,-0.5\-0.5,1)

. bayes, iwishartprior(10 S): mixed math5 math3 || school: math3,
> covariance(unstructured)
note: Gibbs sampling is used for regression coefficients and variance
components.

Burn-in 2500 aaaaaaaaa1000aaaaaaaaa2000aaaaa done
Simulation 10000 .........1000.........2000.........3000.........4000.........5
> 000.........6000.........7000.........8000.........9000.........10000 done

Multilevel structure

school
{U0}: random intercepts
{U1}: random coefficients for math3

Model summary

Likelihood:
math5 ~ normal(xb_math5,{e.math5:sigma2})

Priors:
{math5:math3 _cons} ~ normal(0,10000)                                    (1)
{U0 U1} ~ mvnormal(2,{U:Sigma,m})                            (1)
{e.math5:sigma2} ~ igamma(.01,.01)

Hyperprior:
{U:Sigma,m} ~ iwishart(2,10,S)
(1) Parameters are elements of the linear form xb_math5.

some output omitted

Equal-tailed
Mean   Std. dev.     MCSE     Median  [95% cred. interval]

math5
math3    .6130199   .0537473    .00282    .613916   .5058735   .7180286
_cons     30.3789   .3223274   .016546    30.3816   29.74903   31.02091

school
U:Sigma_1_1    3.482914   1.104742   .048864   3.344148   1.770735     6.0136
U:Sigma_2_1   -.2712029   .1169666   .004214  -.2596221  -.5337747  -.0745626
U:Sigma_2_2    .0775669   .0210763   .000651    .074876   .0443026   .1264642

e.math5
sigma2    26.94206   1.342571   .022106   26.90405    24.4033   29.66083

Note: Default priors are used for model parameters.

The iwishartprior() option overrides the parameters of the default inverse-Wishart prior distributions used by bayes:. In our example, this prior is used for the covariance matrix with the default arguments of 3 degrees of freedom and an identity scale matrix. In the above example, we instead used 10 degrees of freedom and the scale matrix S.

### Exponential survival model

Consider survival data that record durations (in months) of employment of individuals from tstart to tend. Variable failure contains 1 when the end date corresponds to the end of employment, and 0 otherwise. A two-level model would account for the variability between individuals, who are identified by the id variable. To demonstrate a three-level model, let's also account for the variability between birth years of individuals, which may help explain some of between-individual variability.

Here is how we would proceed with the standard multilevel analysis for these data.

. stset tend, origin(tstart) failure(failure)

. mestreg education njobs prestige i.female || birthyear: || id:, distribution(exponential)


We model the time to the end of employment as a function of the education level, the number of jobs held previously, the prestige of the current job, and gender.

In our Bayesian analysis, we will compare how well the two survival models, exponential and lognormal, fit these data.

First, we fit a Bayesian three-level exponential model by simply prefixing the above mestreg command.

. bayes: mestreg education njobs prestige i.female || birthyear: || id:,
>   distribution(exponential)


The output is lengthy, so we describe it in parts.

The multilevel summary provides the names of parameters, {U0} and {UU0}, for random intercepts at the third and second levels of hierarchy, respectively. (Observations compose the first level.)


Multilevel structure

birthyear
{U0}: random intercepts

birthyear>id
{UU0}: random intercepts



The model summary describes the likelihood model and prior distributions used. For example, both variance components, {U0:sigma2} and {UU0:sigma2}, are assigned inverse-gamma prior distributions with scale and shape parameters of 0.01.


Model summary

Likelihood:
_t ~ mestreg(xb__t)

Priors:
{_t:education njobs prestige 1.female _cons} ~ normal(0,10000)           (1)
{U0} ~ normal(0,{U0:sigma2})     (1)
{UU0} ~ normal(0,{UU0:sigma2})    (1)

Hyperpriors:
{U0:sigma2} ~ igamma(.01,.01)
{UU0:sigma2} ~ igamma(.01,.01)

(1) Parameters are elements of the linear form xb__t.

The header information now includes a group summary for each hierarchical level.

Bayesian multilevel exponential PH regression    MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
MCMC sample size =     10,000

No. of       Observations per group
Group variable       groups    Minimum    Average    Maximum

birthyear          12          3       50.0         99
id         201          1        3.0          9

Number of obs    =        600
Acceptance rate  =      .3045
Efficiency:  min =    .008093
avg =     .01554
Log marginal-likelihood                                       max =     .02482



Just like mestreg, bayes: mestreg by default reports hazard ratios for the exponential survival model. You can use the nohr option to instead obtain coefficients. You can specify this option with bayes or mestreg during estimation or with bayes on replay.

. bayes, nohr

Equal-tailed
Mean   Std. dev.     MCSE     Median  [95% cred. interval]

-t
education    .1008933   .0338232   .003186   .0995609   .0383122   .1732759
njobs    -.248267   .0477605   .003938  -.2493944  -.3383752  -.1606085
prestige   -.0296164    .006582   .000466  -.0294993  -.0428695  -.0171793
1.female    .4593769   .1527497   .025605   .4589502   .1445377   .7329368
_cons   -4.381721   .3651954   .045888  -4.356305  -5.129819  -3.735914

birthyear
U0:sigma2    .1177088   .1230839   .014551   .0834062   .0147882   .4957971

birthyear>id
UU0:sigma2    .5227624     .12143   .009875   .5153562   .3075033    .784219



We save the MCMC results and store the estimation results from our Bayesian exponential survival model for later model comparison. We also specify the remargl option to compute the log marginal-likelihood, which we explain below.

. bayes, remargl saving(streg_exp_mcmc)

. estimates store streg_exp


### Log marginal-likelihood

Previously, we used bayes: mestreg to fit a Bayesian exponential survival model to employment duration data. If you look closely at the header output from bayes: mestreg,

Bayesian multilevel exponential PH regression    MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
MCMC sample size =     10,000
Number of obs    =        600
Acceptance rate  =      .3045
Efficiency:  min =    .008093
avg =     .01554
Log marginal-likelihood                                       max =     .02482


you will notice that no value is reported for the log marginal-likelihood (LML). This is intentional. As we mentioned earlier, Bayesian multilevel models treat random effects as parameters and thus may contain many model parameters. For models with many parameters or high-dimensional models, the computation of LML can be time consuming, and its accuracy may become unacceptably low. For these reasons, the bayes prefix does not compute the LML for multilevel models by default. You can specify the remargl option during estimation or on replay to compute it.

Here is a subset of the relevant output after typing

. bayes, remargl

Bayesian multilevel exponential PH regression    MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
MCMC sample size =     10,000
Number of obs    =        600
Acceptance rate  =      .3045
Efficiency:  min =    .008093
avg =     .01554
Log marginal-likelihood = -2530.4131                          max =     .02482


Notice that the LML value is now reported in the header output.

Why do we need the value of LML? Some of the Bayesian summaries used for model comparison such as Bayes factors and model posterior probabilities are based on LML. In this example, we want to demonstrate the use of model posterior probabilities to compare Bayesian models, and so we needed to compute LML.

### Lognormal survival model

We now fit a Bayesian three-level lognormal model by specifying the corresponding distribution in the distribution() option. We also specify the remargl and saving() options during estimation this time.

. bayes, remargl saving(streg_lnorm_mcmc):
>   mestreg education njobs prestige i.female || birthyear: || id:,
>   distribution(lognormal)

Bayesian multilevel lognormal AFT regression     MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
MCMC sample size =     10,000
Number of obs    =        600
Acceptance rate  =      .3714
Efficiency:  min =    .004755
avg =     .01522
Log marginal-likelihood = -2496.4991                          max =     .04602

Equal-tailed
Mean   Std. dev.     MCSE     Median  [95% cred. interval]

-t
education   -.0951555   .0314752   .002934  -.0947659  -.1563553  -.0348788
njobs    .1733276   .0423128   .004265   .1745013   .0871177   .2545806
prestige    .0272722   .0056387   .000263   .0272781   .0160048   .0384117
1.female   -.3406851   .1323801   .018735  -.3383823  -.5965842  -.0912631
_cons    3.822971   .4130557   .052204   3.817862   3.027644   4.635152

lnsigma    .0723268   .0419609   .002715   .0722478  -.0100963   .1569296

birthyear
U0:sigma2    .1669752   .1566106   .013008    .120028   .0252243   .6019641

birthyear>id
UU0:sigma2    .3061048   .1067159   .015475   .2919266   .1357847   .5485671

Note: Default priors are used for model parameters.
Note: Adaptation tolerance is not met in at least one of the blocks.

. estimates store streg_lnorm

### Model comparison using posterior probabilities

We now compare models using model posterior probabilities.

. bayestest model streg_exp streg_lnorm

Bayesian model tests

log(ML)       P(M)     P(M|y)

streg_exp   -2.53e+03     0.5000     0.0000
streg_lnorm   -2.50e+03     0.5000     1.0000

Note: Marginal likelihood (ML) is computed using
Laplace-Metropolis approximation.

The posterior probability for the lognormal model is essentially 1, so it is preferable to the exponential model for these data.

### Crossed-effects logistic model

In example 5 of [ME] melogit, we fit a crossed-effects model to the data from a study measuring students' attainment scores in primary and secondary schools from Fife, Scotland. The binary outcome of interest is whether the score is greater than 6. The model includes sex as the covariate and the effects of primary and secondary schools as crossed random effects.

. melogit attain_gt_6 sex || _all: R.sid || pid:


We fit the corresponding Bayesian crossed-effects model by simply prefixing the above melogit command with bayes:

. bayes: melogit attain_gt_6 sex || _all: R.sid || pid:


The output is lengthy, so as before, we describe it in parts.


Multilevel structure

sid
{U0}: random intercepts
pid
{V0}: random intercepts


We have two sets of random intercepts, {U0} and {V0}, at the secondary sid and primary pid levels, respectively.


Model summary

Likelihood:
attain_gt_6 ~ melogit(xb_attain_gt_6)

Priors:
{attain_gt_6:sex _cons} ~ normal(0,10000)                                (1)
{U0} ~ normal(0,{U0:sigma2})                          (1)
{V0} ~ normal(0,{V0:sigma2})                          (1)

Hyperpriors:
{U0:sigma2} ~ igamma(.01,.01)
{V0:sigma2} ~ igamma(.01,.01)
(1) Parameters are elements of the linear form xb_attain_gt_6.

The likelihood model is a multilevel logistic model. The prior distributions are normal for regression coefficients and random intercepts and are inverse-gamma for the variance components.



No. of       Observations per group
Group variable       groups    Minimum    Average    Maximum

_all           1      3,435    3,435.0      3,435
pid          148          1       23.2         72

Family: Bernoulli                                Number of obs    =      3,435
Link  : logit                                    Acceptance rate  =      .3086
Efficiency:  min =     .01048
avg =      .0245
Log marginal-likelihood                                       max =     .03698


The header information includes the MCMC simulation summary as well as other details about the fitted Bayesian model.



Equal-tailed
Mean   Std. dev.     MCSE     Median  [95% cred. interval]

attain_gt_6
sex    .2898934   .0731912   .004749   .2879629   .1568701   .4429235
_cons   -.6502969    .109456   .010693  -.6497501  -.8509903  -.4300285

sid
U0:sigma2    .1484609   .0826556   .005049   .1336719   .0241995   .3532284

pid
V0:sigma2    .4777177   .0954019   .004961      .4701   .3228973   .6864048

Note: Default priors are used for model parameters.

The results suggest that both primary and secondary schools contribute to the variation in whether the attainment score is greater than 6 after adjusting for sex in the model.

Instead of the estimates of coefficients, we can obtain the estimates of odds ratios.

. bayes, or

Equal-tailed
Odds ratio   Std. dev.     MCSE     Median  [95% cred. interval]

attain_gt_6
sex    1.339886   .0988964    .00647   1.333708   1.169844   1.557253
_cons    .5250545   .0584524   .005902   .5221763   .4269919   .6504906

sid
U0:sigma2    .1484609   .0826556   .005049   .1336719   .0241995   .3532284

pid
V0:sigma2    .4777177   .0954019   .004961      .4701   .3228973   .6864048
Note: Estimates are transformed only in the first equation to odds ratios.
Note: _cons estimates baseline odds (conditional on zero random effects).
Note: Default priors are used for model parameters.

We could have also specified the or option during estimation either with the bayes prefix

. bayes, or: melogit attain_gt_6 sex || _all: R.sid || pid:


or with melogit

. bayes: melogit attain_gt_6 sex || _all: R.sid || pid:, or


### Reference

Mortimore, P., P. Sammons, L. Stoll, D. Lewis, and R. Ecob. 1988. School Matters: The Junior Years. Wells, Somerset, UK: Open Books.

### Tell me more

Learn more about the general features of the bayes prefix.

Read more about the bayes prefix and Bayesian analysis in the Stata Bayesian Analysis Reference Manual.