Fitting Bayesian regression models can be just as intuitive as performing Bayesian inference—introducing the bayes prefix in Stata. The bayes prefix combines Bayesian features with Stata's intuitive and elegant specification of regression models. It lets you fit Bayesian regression models more easily and fit more models.
You fit linear regression by using
. regress y x1 x2
You can fit Bayesian linear regression by simply using
. bayes: regress y x1 x2
You can also fit a Bayesian survival model by simply using
. bayes: streg x1 x2, distribution(weibull)
You can use the bayes prefix with many more regression models, including logistic, ordered probit, multinomial logistic, Poisson, generalized linear, conditional logistic, zero-inflated, sample-selection, and more. See the full list of supported estimation commands. Multilevel models are among the supported models too! See Bayesian multilevel models for details.
All of Stata's Bayesian features are supported by the bayes prefix. You can select from many prior distributions for model parameters or use default priors or even define your own priors. You can use the default adaptive Metropolis–Hastings sampling, or Gibbs sampling, or a combination of the two sampling methods. And you can use any other feature included in the bayesmh command ([BAYES] bayesmh).
After estimation, you can use Stata's standard Bayesian postestimation tools such as
Let's now see the bayes prefix in action.
Consider data on math scores of pupils in the third and fifth years from different schools in Inner London (Mortimore et al. 1988). We want to fit a linear regression of five-year math scores (math5) on three-year math scores (math3). Here we ignore potential school effects; see Bayesian two-level models for analyses that account for them.
We fit a linear regression by typing
. regress math5 math3
Source | SS df MS | Number of obs = 887 | |
F(1, 885) = 341.40 | |||
Model | 10960.2737 1 10960.2737 | Prob > F = 0.0000 | |
Residual | 28411.6181 885 32.1035233 | R-squared = 0.2784 | |
Total | 39371.8918 886 44.4378011 | Root MSE = 5.666 |
math5 | Coef. Std. Err. t P>|t| [95% Conf. Interval] | |
math3 | .6081306 .0329126 18.48 0.000 .5435347 .6727265 | |
_cons | 30.34656 .1906157 159.20 0.000 29.97245 30.72067 | |
To fit a Bayesian linear regression, we simply prefix the above regress command with bayes:.
. bayes: regress math5 math3 Burn-in ... Simulation ...
Model summary | ||
Likelihood: | ||
math5 ~ regress(xb_math5,{sigma2}) | ||
Priors: | ||
{math5:math3 _cons} ~ normal(0,10000) (1) | ||
{sigma2} ~ igamma(.01,.01) | ||
(1) Parameters are elements of the linear form xb_math5. |
Equal-tailed | ||
Mean Std. Dev. MCSE Median [95% Cred. Interval] | ||
math5 | ||
math3 | .6070097 .0323707 .000976 .6060445 .5440594 .6706959 | |
_cons | 30.3462 .1903067 .005658 30.34904 29.97555 30.71209 | |
sigma2 | 32.17492 1.538155 .031688 32.0985 29.3045 35.38031 | |
The results are similar between the two commands, but the interpretations are different. Recall that Bayesian models assume that all parameters are random. So a 95% credible interval of [0.54,0.67] for the slope coefficient on third-year math scores {math5:math3} implies that the probability that {math5:math3} is between 0.54 and 0.67 is 0.95. Such probabilistic interpretation is not appropriate for the confidence interval from regress, although the confidence interval does suggest a similar plausible range for {math5:math3}.
Also, for each parameter, regress produced just one estimate, whereas bayes: regress produced 10,000 MCMC estimates. MCMC estimates are simulated from a posterior distribution of model parameters, after discarding the first 2,500 estimates for the burn-in period. What is reported in the output are the summaries of the marginal posterior distributions of the parameters such as posterior means and posterior standard deviations.
The results will not always be similar between the two commands. They are similar in this example because the default priors the bayes prefix used are not informative for these data. The priors used are described in the model summary. They are normal with mean 0 and variance of 10,000 for the regression coefficients and inverse-gamma with shape and scale parameters of 0.01 for the error variance. Prior distributions are important for Bayesian models, and we will discuss them later in Custom priors.
We can use any feature of the bayesmh command, which fits general Bayesian models, with the bayes prefix. For example, we can display 90% highest posterior density (HPD) credible intervals instead of the default equal-tailed intervals on replay. We do this with bayes just like we do this with bayesmh:
. bayes, hpd clevel(90) Bayesian linear regression MCMC iterations = 12,500 Random-walk Metropolis-Hastings sampling Burn-in = 2,500 MCMC sample size = 10,000 Number of obs = 887 Acceptance rate = .3312 Efficiency: min = .1099 avg = .1529 Log marginal likelihood = -2817.2335 max = .2356
HPD | ||
Mean Std. Dev. MCSE Median [90% Cred. Interval] | ||
math5 | ||
math3 | .6070097 .0323707 .000976 .6060445 .5580554 .6634722 | |
_cons | 30.3462 .1903067 .005658 30.34904 30.05275 30.66751 | |
sigma2 | 32.17492 1.538155 .031688 32.0985 29.6363 34.66541 | |
We can use any of Bayesian postestimation features (except bayespredict) after the bayes prefix.
For example, we can check MCMC convergence for {math5:math3} using bayesgraph diagnostics ([BAYES] bayesgraph).
. bayesgraph diagnostics {math5:math3}
Or we can test interval hypotheses using bayestest interval ([BAYES] bayestest interval):
. bayestest interval {math5:math3}, lower(0.5) upper(0.6) Interval tests MCMC sample size = 10,000 prob1 : 0.5 < {math5:math3} < 0.6
Mean Std. Dev. MCSE | ||
prob1 | .42 0.49358 .0160684 | |
The probability that the coefficient {math5:math3} lies between 0.5 and 0.6 is .42.
Also see Multiple chains and Gelman-Rubin convergence diagnostic for investigating convergence using multiple chains.
The default sampling algorithm used by the bayes prefix with most of the estimation commands is an adaptive Metropolis–Hastings algorithm. For some prior and likelihood model combinations, an alternative Gibbs sampling algorithm may be available. Gibbs sampling is generally slower but more efficient than the Metropolis–Hastings sampling. By "more efficient", we mean that you will get more precise estimates from Gibbs sampling for the same number of iterations.
We can specify the gibbs option with bayes for linear regression to perform Gibbs sampling.
. bayes, gibbs: regress math5 math3 Bayesian linear regression MCMC iterations = 12,500 Gibbs sampling Burn-in = 2,500 MCMC sample size = 10,000 Number of obs = 887 Acceptance rate = 1 Efficiency: min = 1 avg = 1 Log marginal likelihood = -2817.184 max = 1
Equal-tailed | ||
Mean Std. Dev. MCSE Median [95% Cred. Interval] | ||
math5 | ||
math3 | .6085104 .0333499 .000333 .6087819 .5426468 .6731657 | |
_cons | 30.34419 .1916673 .001917 30.34441 29.97587 30.72617 | |
sigma2 | 32.16765 1.551119 .015511 32.10778 29.238 35.29901 | |
Notice that the minimum and maximum efficiencies reported in the header are equal to 1 for Gibbs sampling and are much higher than those from the adaptive Metropolis–Hastings sampling. Higher efficiencies mean smaller Monte Carlo standard errors (MCSEs) and thus more precise posterior mean estimates. In our example, there is no practical difference between the two sampling procedures.
bayes's gibbs option specifies full Gibbs sampling in which all model parameters are sampled using the Gibbs method. Full Gibbs sampling is not available for all likelihood models and prior combinations. In fact, only bayes: regress and bayes: mvreg support this option. However, Gibbs sampling may be available for some model parameters, in which case you can use a hybrid sampling in which some parameters are sampled using the Gibbs method and the others using the Metropolis–Hastings method; see Gibbs and hybrid MH sampling in [BAYES] bayesmh for details.
As we mentioned earlier, a prior distribution is an important component of a Bayesian model and may have a large impact on your results. It summarizes your knowledge about a model parameter before you see the data. After the data are observed, the prior distribution is updated with the information from the observed data to form the posterior distribution of the parameter. The posterior distribution is then used for Bayesian inference.
Your science or previous studies will often suggest the appropriate prior distributions for your model parameters. In the absence of such knowledge, noninformative priors are recommended. The choice of a prior distribution becomes especially important with small datasets, which typically contain less information about the model parameters.
For convenience, the bayes prefix provides default priors, but you can also specify your own. The default priors are chosen to be fairly noninformative for moderately scaled model parameters but may become informative for large values of parameters.
The bayes prefix provides convenient options to modify the values of the default priors. For example, bayes: regress uses normal priors with mean 0 and variance of 10,000 for the regression coefficients and an inverse-gamma prior with shape and scale parameters of 0.01 for the error variance. We can change the default values by using the normalprior() and igammaprior() options.
. bayes, normalprior(10) igammaprior(1 2): regress math5 math3
Model summary | ||
Likelihood: | ||
math5 ~ regress(xb_math5,{sigma2}) | ||
Priors: | ||
{math5:math3 _cons} ~ normal(0,100) (1) | ||
{sigma2} ~ igamma(1,2) | ||
(1) Parameters are elements of the linear form xb_math5. |
Equal-tailed | ||
Mean Std. Dev. MCSE Median [95% Cred. Interval] | ||
math5 | ||
math3 | .6076875 .033088 .000948 .6076282 .5405233 .673638 | |
_cons | 30.326 .1931568 .005602 30.32804 29.93212 30.70529 | |
sigma2 | 32.09694 1.530839 .034185 32.03379 29.27687 35.37723 | |
In the above, we used a standard deviation of 10 for the normal priors and the shape of 1 and scale of 2 for the inverse-gamma prior.
We can specify custom priors for some of the parameters and leave the priors for other parameters at their defaults. We can also use prior distributions other than the default. For example, below we use a uniform on (-50,50) prior only for the regression coefficients.
. bayes, prior({math5:}, uniform(-50,50)): regress math5 math3
Model summary | ||
Likelihood: | ||
math5 ~ regress(xb_math5,{sigma2}) | ||
Priors: | ||
{math5:math3 _cons} ~ uniform(-50,50) (1) | ||
{sigma2} ~ igamma(.01,.01) | ||
(1) Parameters are elements of the linear form xb_math5. |
Equal-tailed | ||
Mean Std. Dev. MCSE Median [95% Cred. Interval] | ||
math5 | ||
math3 | .6076356 .0328739 .00094 .6056774 .5433171 .6759952 | |
_cons | 30.33498 .1908214 .005361 30.33159 29.96478 30.70833 | |
sigma2 | 32.15793 1.555689 .031691 32.08331 29.308 35.35561 | |
{math5:} in the above prior() option is a shortcut for referring to all regression coefficients associated with the dependent variable math5: {math5:math3} and {math5:_cons}.
Or we can use custom priors for all model parameters.
. bayes, prior({math5:_cons}, uniform(-50,50)) prior({math5:math3}, uniform(-5,5)) prior({sigma2}, jeffreys): regress math5 math3
Model summary | ||
Likelihood: | ||
math5 ~ regress(xb_math5,{sigma2}) | ||
Priors: | ||
{math5:_cons} ~ uniform(-50,50) (1) | ||
{math5:math3} ~ uniform(-5,5) (1) | ||
{sigma2} ~ jeffreys | ||
(1) Parameters are elements of the linear form xb_math5. |
Equal-tailed | ||
Mean Std. Dev. MCSE Median [95% Cred. Interval] | ||
math5 | ||
math3 | .6067153 .0333801 .000906 .606437 .5417588 .6723998 | |
_cons | 30.34962 .1935328 .005346 30.35526 29.97142 30.72829 | |
sigma2 | 32.17297 1.546291 .033087 32.1054 29.34673 35.36544 | |
The bayes: regress specification is convenient, but we could already use bayesmh to fit a linear regression. What we cannot do when using bayesmh, for example, is fit an autoregressive model. We can use bayes: regress to do that.
Consider quarterly data on coal consumption in the United Kingdom from 1960 to 1986 (for example, Congdon [2003] and Harvey [1989]). The outcome of interest is coal consumption (in millions of tons) in a given year and quarter. Variable lcoal is transformed using log(coal/1000).
Let's fit a Bayesian autoregressive model with one lag, AR(1), to these data.
. bayes: regress lcoal L.lcoal
Model summary | ||
Likelihood: | ||
lcoal ~ regress(xb_lcoal,{sigma2}) | ||
Priors: | ||
{lcoal:L.lcoal _cons} ~ normal(0,10000) (1) | ||
{sigma2} ~ igamma(.01,.01) | ||
(1) Parameters are elements of the linear form xb_lcoal. |
Equal-tailed | ||
Mean Std. Dev. MCSE Median [95% Cred. Interval] | ||
lcoal | ||
lcoal | ||
L1. | .7143121 .0649968 .001877 .7123857 .5884089 .8436602 | |
_cons | -.6896604 .1561023 .004433 -.6935272 -.9970502 -.3879924 | |
sigma2 | .1702592 .0243144 .000557 .1672834 .1299619 .2248287 | |
We used Stata's time-series lag operator L. to include the first lag of our dependent variable lcoal in the regression model.
We save MCMC estimates and store estimation results from our Bayesian AR(1) model for later comparison with other AR models.
. bayes, saving(lag1_mcmc) . estimates store lag1
The stationarity assumption of an AR(1) model requires that the first lag coefficient, {lcoal:L.lcoal}, is between -1 and 1. We can use a prior distribution to incorporate this assumption in our Bayesian model. For example, we can specify a uniform on (-1,1) prior for {lcoal:L.lcoal}.
. bayes, prior({lcoal:L.lcoal}, uniform(-1,1)): regress lcoal L.lcoal
Model summary | ||
Likelihood: | ||
lcoal ~ regress(xb_lcoal,{sigma2}) | ||
Priors: | ||
{lcoal:L.lcoal} ~ uniform(-1,1) (1) | ||
{lcoal:_cons} ~ normal(0,10000) (1) | ||
{sigma2} ~ igamma(.01,.01) | ||
(1) Parameters are elements of the linear form xb_lcoal. |
Equal-tailed | ||
Mean Std. Dev. MCSE Median [95% Cred. Interval] | ||
lcoal | ||
lcoal | ||
L1. | .7111236 .0716918 .008876 .7085994 .5811546 .8647907 | |
_cons | -.6964178 .1747408 .021995 -.7047339 -1.013976 -.3139992 | |
sigma2 | .1709232 .0240107 .000577 .1695033 .1296427 .2222473 | |
The specified uniform prior has little impact on our results.
In the previous section, we fit an AR(1) model to the coal consumption data. Is the AR(1) model sufficient for these data? Do we need a higher-order model such as AR(2) or AR(3)? We can use Bayesian model selection to answer this question.
We will use bayestest model ([BAYES] bayestest model) to compare different AR models using model posterior probabilities. To use bayestest model, we need to fit each model of interest separately and store its estimation results. For brevity, we omit the outputs from the fitted models.
We fit the AR(2) model by including the second lag L2.lcoal of the dependent variable in the regression model. We use bayes's saving() option during estimation to save MCMC estimates in the Stata dataset lag2_mcmc.dta. This dataset is part of Bayesian estimation results, and it must be saved before estimates store can be used. We then use estimates store to store the standard estimation results.
. bayes, saving(lag2_mcmc): regress lcoal L.lcoal L2.lcoal . estimates store lag2
We repeat the above steps for the AR(3), AR(4), and AR(5) models.
. bayes, saving(lag3_mcmc): regress lcoal L(1/3).lcoal . estimates store lag3 . bayes, saving(lag4_mcmc): regress lcoal L(1/4).lcoal . estimates store lag4 . bayes, saving(lag5_mcmc): regress lcoal L(1/5).lcoal . estimates store lag5
We then use bayestest model to compare the models.
. bayestest model lag1 lag2 lag3 lag4 lag5 Bayesian model tests
log(ML) P(M) P(M|y) | ||
lag1 | -75.8897 0.2000 0.0000 | |
lag2 | -82.5078 0.2000 0.0000 | |
lag3 | -59.6688 0.2000 0.0000 | |
lag4 | -13.8944 0.2000 0.9990 | |
lag5 | -20.8194 0.2000 0.0010 | |
Given equal prior probabilities for all five AR models, the AR(4) model has the highest posterior probability of 0.9990.
Given that our data are quarterly, it is not surprising that the fourth lag is so important. It is comforting that our data confirm that importance.
Editors note: This section uses uniquely Bayesian thinking to address the problem of model selection. It is a wonderful solution, but be forewarned that several frequentists at StataCorp are still scratching their heads. You can skip the section if you wish.
In the previous section, we used model posterior probabilities to choose the autocorrelation lag. Instead, we can extend our Bayesian autoregressive model to estimate the lag. Let the autocorrelation lag be another parameter in the model. Suppose that it can take any integer value between 1 and 5. Make prior distributions of the lag coefficients depend on the lag. If the estimated lag parameter is greater or equal to p, assign the pth lag coefficient a normal prior with mean 0 and a variance of 100. Otherwise, assign it a normal prior with a mean 0 and a variance of 0.01. In other words, if the estimated lag parameter is equal to p, we shrink all of the lag coefficients corresponding to lags greater than p toward zero.
The arguments of the prior distributions in the prior() option can be specified as expressions. We use this feature to accommodate the above priors for the lag coefficients,
. bayes, prior({lcoal:L1.lcoal}, normal(0, cond({lag}>=1,100,0.01))) ...
where {lag} is the autocorrelation lag parameter.
Our final model specification is as follows, in which {lag} is assigned a discrete prior with equal probabilities for each of the possible values 1, 2, 3, 4, and 5.
. bayes, prior({lcoal:L1.lcoal}, normal(0, cond({lag}>=1,100,0.01))) prior({lcoal:L2.lcoal}, normal(0, cond({lag}>=2,100,0.01))) prior({lcoal:L3.lcoal}, normal(0, cond({lag}>=3,100,0.01))) prior({lcoal:L4.lcoal}, normal(0, cond({lag}>=4,100,0.01))) prior({lcoal:L5.lcoal}, normal(0, cond({lag}>=5,100,0.01))) prior({lag}, index(0.2,0.2,0.2,0.2,0.2)): regress lcoal L(1/5).lcoal
Model summary | ||
Likelihood: | ||
lcoal ~ regress(xb_lcoal,{sigma2}) | ||
Priors: | ||
{lcoal:L.lcoal} ~ normal(0,cond({lag}>=1,100,0.01)) (1) | ||
{lcoal:L2.lcoal} ~ normal(0,cond({lag}>=2,100,0.01)) (1) | ||
{lcoal:L3.lcoal} ~ normal(0,cond({lag}>=3,100,0.01)) (1) | ||
{lcoal:L4.lcoal} ~ normal(0,cond({lag}>=4,100,0.01)) (1) | ||
{lcoal:L5.lcoal} ~ normal(0,cond({lag}>=5,100,0.01)) (1) | ||
{lcoal:_cons} ~ normal(0,10000) (1) | ||
{sigma2} ~ igamma(.01,.01) | ||
Hyperprior | ||
{lag} ~ index(0.2,0.2,0.2,0.2,0.2) | ||
(1) Parameters are elements of the linear form xb_lcoal. |
Equal-tailed | ||
Mean Std. Dev. MCSE Median [95% Cred. Interval] | ||
lcoal | ||
lcoal | ||
L1. | .2062446 .0784492 .011311 .2050062 .0487352 .3605725 | |
L2. | -.0738366 .0588681 .002764 -.0739381 -.1877364 .0391768 | |
L3. | .100462 .0597828 .004398 .1003963 -.0142032 .2216838 | |
L4. | .7994076 .0606384 .006607 .8031808 .6651497 .910174 | |
L5. | -.0729926 .0698683 .009211 -.0708155 -.2074388 .060126 | |
_cons | -.1401982 .0812334 .015212 -.1438271 -.2877263 .0403175 | |
sigma2 | .0343128 .0051157 .000123 .0338508 .0256253 .0456132 | |
lag | 4.0194 .1379331 .004424 4 4 4 | |
The estimated lag is 4, which is consistent with our findings based on posterior probabilities of the considered AR models in the previous section.
Bayesian regression models can be useful in the presence of perfect predictors. Suppose that we want to model the binary outcome disease, the presence of a heart disease, as a function of a number of covariates: age, gender, indicator for fasting blood sugar greater than 120 mg/dl (variable isfbs), and categories for resting electrocardiographic results (variable restecg). This example is described in detail in Logistic regression with perfect predictors of [BAYES] bayes. The data are a sample from Switzerland.
Let's fit the standard logistic model first.
. logit disease restecg isfbs age male note: restecg != 0 predicts success perfectly restecg dropped and 17 obs not used note: isfbs != 0 predicts success perfectly isfbs dropped and 3 obs not used note: male != 1 predicts success perfectly male dropped and 2 obs not used Logistic regression Number of obs = 26 LR chi2(1) = 0.01 Prob > chi2 = 0.9403 Log likelihood = -4.2358076 Pseudo R2 = 0.0007
disease | Coef. Std. Err. z P>|z| [95% Conf. Interval] | |
restecg | 0 (omitted) | |
isfbs | 0 (omitted) | |
age | -.0097846 .1313502 -0.07 0.941 -.2672263 .2476572 | |
male | 0 (omitted) | |
_cons | 3.763893 7.423076 0.51 0.612 -10.78507 18.31285 | |
Variables restecg, isfbs, and male are omitted from the model because they predict the success probability perfectly for these data. In other words, the corresponding coefficients cannot be identified based on the observed data. The problem of perfect predictors often arises in binary-outcome models, especially with small datasets as in our example.
Sometimes, we may have reliable external information about the parameters of interest that can help identify them when the data alone cannot. For example, a similar heart study was conducted in Hungary. The estimates of the parameters from that study may be used to form the informative prior distributions for the parameters of the Swiss study.
We may consider a Bayesian logistic regression with the following informative priors.
. bayes, prior({disease:restecg age}, normal(0,10)) prior({disease:isfbs male}, normal(1,10)) prior({disease:_cons}, normal(-4,10)) nomleinitial: logit disease restecg isfbs age male, asis
Model summary | ||
Likelihood: | ||
disease ~ logit(xb_disease) | ||
Priors: | ||
{disease:restecg age} ~ normal(0,10) (1) | ||
{disease:isfbs male} ~ normal(1,10) (1) | ||
{disease:_cons} ~ normal(-4,10) (1) | ||
(1) Parameters are elements of the linear form xb_disease. |
Equal-tailed | ||
disease | Mean Std. Dev. MCSE Median [95% Cred. Interval] | |
restecg | 1.965122 2.315475 .115615 1.655961 -2.029873 6.789415 | |
isfbs | 1.708631 2.726071 .113734 1.607439 -3.306837 7.334592 | |
age | .1258811 .0707431 .003621 .1245266 -.0016807 .2719748 | |
male | .2671381 2.237349 .162967 .3318061 -4.106425 4.609955 | |
_cons | -2.441911 2.750613 .110611 -2.538183 -7.596747 3.185172 | |
Whenever informative priors are used, the sensitivity of the results to the choice of the priors must be explored. For example, it will be useful to consider other heart disease studies to verify the reasonableness of the prior specifications and to use other priors to check the sensitivity of the estimated parameters to the choice of priors.
Just like with other models, to fit Bayesian generalized linear models, we can simply prefix Stata's glm command with bayes:.
Example 2 in [R] glm analyzes data from an insecticide experiment. The effect of the dose of insecticide on the number of flour beetles killed is of interest. Variable ldose records the log dose of insecticide; n, the number of flour beetles subjected to each dose; and r, the number killed. Two models are considered: binomial regression with a logit link and with a complementary log-log link. Below we fit Bayesian analogs of these models and compare them using the deviance information criterion (DIC).
We fit Bayesian binomial regression with a logit link by typing
. bayes: glm r ldose, family(binomial n)
Model summary | ||
Likelihood: | ||
r ~ glm(xb_r) | ||
Prior: | ||
{r:ldose _cons} ~ normal(0,10000) (1) | ||
(1) Parameters are elements of the linear form xb_r. |
Equal-tailed | ||
r | Mean Std. Dev. MCSE Median [95% Cred. Interval] | |
ldose | 34.48517 2.816542 .090987 34.43137 29.12495 40.06385 | |
_cons | -61.09645 5.005658 .161527 -61.0278 -71.01371 -51.59132 | |
We save the MCMC results and store the estimation results for later model comparison.
. bayes, saving(logit_mcmc) . estimates store logit
We fit Bayesian binomial regression with a complementary log-log link as follows. This time, we save the MCMC results during estimation.
. bayes, saving(cloglog_mcmc): glm r ldose, family(binomial n) link(cloglog) Bayesian generalized linear models MCMC iterations = 12,500 Random-walk Metropolis-Hastings sampling Burn-in = 2,500 MCMC sample size = 10,000 Family : binomial n Number of obs = 8 Link : complementary log-log Scale parameter = 1 Acceptance rate = .2069 Efficiency: min = .1264 avg = .1265 Log marginal likelihood = -26.111739 max = .1265
Equal-tailed | ||
r | Mean Std. Dev. MCSE Median [95% Cred. Interval] | |
ldose | 22.17523 1.750735 .049226 22.15812 18.93784 25.68367 | |
_cons | -39.81636 3.152542 .088657 -39.78953 -46.13103 -33.98554 | |
We now use bayesstats ic ([BAYES] bayesstats ic) to compare the models.
. bayesstats ic logit cloglog Bayesian information criteria
DIC log(ML) log(BF) | ||
logit | 41.26855 -29.16791 . | |
cloglog | 33.51447 -26.11174 3.056174 | |
The complementary log-log model is preferable because its DIC value is smaller.
Congdon, P. 2003. Applied Bayesian Modeling. New York: John Wiley & Sons.
Harvey, A. C. 1989. Forecasting, Structural Time Series Models, and the Kalman Filter. Cambridge: Cambridge University Press.
Mortimore, P., P. Sammons, L. Stoll, D. Lewis, and R. Ecob. 1988. School Matters: The Junior Years. Wells, Somerset, UK: Open Books.
Learn more about new Bayesian-analysis features.
Learn more about Bayesian multilevel models, Bayesian panel-data models, Bayesian survival models, and Bayesian sample-selection models.
Learn more about Stata's Bayesian analysis features.
Read more about the bayes prefix and Bayesian analysis in the Stata Bayesian Analysis Reference Manual.