

Highlights

• Asymmetric Laplace distribution (ALD) to model skewed outcomes

• Flexible prior specifications

• Bayesian quantile estimation via ALD

• Simultaneous quantile estimation by using multiple equations

• Include random effects at multiple levels of hierarchy

• Model quantiles using nonlinear functions of predictors

• Full support of Bayesian postestimation features

• See more Bayesian analysis features

The bayesmh command now includes an asymmetric Laplace distribution (ALD) as a new likelihood function. You can use ALD to model nonnormal outcomes with pronounced skewness and kurtosis. You can also use it to fit Bayesian quantile regression models (Yu and Moyeed 2001). For a Bayesian univariate quantile regression, see the new bayes: qreg command. With bayesmh, you can fit Bayesian simultaneous, multilevel, and nonlinear quantile regression models. These features are part of StataNow™.
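The link between ALD and quantile regression can be sketched outside Stata: the ALD log density is, up to an additive constant, the negative of the quantile check loss, so maximizing an ALD likelihood in the location parameter targets the tau-th quantile. Below is a minimal Python illustration of this equivalence; the simulated data and grid search are ours, not part of the Stata example.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile (check) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

def ald_logpdf(y, mu, sigma, tau):
    """Log density of the asymmetric Laplace distribution ALD(mu, sigma, tau)."""
    return np.log(tau * (1 - tau) / sigma) - check_loss((y - mu) / sigma, tau)

# Up to a constant, the ALD log-likelihood is the negative check loss, so the
# maximizing location parameter is (close to) the sample tau-th quantile:
rng = np.random.default_rng(19)
y = rng.normal(size=1000)
tau = 0.25
grid = np.linspace(-3, 3, 2001)
mu_ml = grid[np.argmax([ald_logpdf(y, m, 1.0, tau).sum() for m in grid])]
print(abs(mu_ml - np.quantile(y, tau)) < 0.05)  # close to the 0.25 quantile
```

This is why fitting bayesmh with an asymlaplaceq() likelihood at quantile 0.5 behaves like a Bayesian median regression.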

#### Random-effects quantile regression

Consider the mathscores dataset, which contains math scores of pupils in the third year (math3) and fifth year (math5) from different schools (school) in Inner London (Mortimore et al. 1988).

. webuse mathscores

. describe

Contains data from https://www.stata-press.com/data/r18/mathscores.dta
 Observations:           887
    Variables:             3                  9 May 2022 23:31
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
school          float   %9.0g                 School ID
math3           float   %9.0g                 Year 3 math score
math5           float   %9.0g                 Year 5 math score
-------------------------------------------------------------------------------
Sorted by:


We are interested in fitting several Bayesian quantile regressions of math5 on math3 for different quantiles of math5. We also want to include a random intercept at the school level in the model to account for the between-school variability of the math scores.

Let's first fit a Bayesian random-intercept model for the median. We specify the asymlaplaceq({scale},0.5) likelihood option with a random scale parameter {scale} and a quantile value of 0.5. We use fairly uninformative priors for the coefficients (normal(0,10000)), scale (igamma(0.01,0.01)), and random-intercept variance (igamma(0.01,0.01)). We use blocking to improve the Markov chain Monte Carlo (MCMC) sampling and specify a random seed for reproducibility.

. bayesmh math5 math3 U[school], likelihood(asymlaplaceq({scale},0.5))
        prior({math5:}, normal(0,10000)) block({math5:})
        prior({scale},  igamma(0.01,0.01)) block({scale})
        prior({var_U},  igamma(0.01,0.01)) rseed(19)

Burn-in 2500 aaaaaaaaa1000aaaaaaaaa2000aaaaa done
Simulation 10000 .........1000.........2000.........3000.........4000.........50
> 00.........6000.........7000.........8000.........9000.........10000 done

Model summary

Likelihood:
  math5 ~ asymlaplaceq(xb_math5,{scale},0.5)

Priors:
  {math5:math3 _cons} ~ normal(0,10000)                                   (1)
          {U[school]} ~ normal(0,{var_U})                                 (1)
              {scale} ~ igamma(0.01,0.01)

Hyperprior:
  {var_U} ~ igamma(0.01,0.01)

(1) Parameters are elements of the linear form xb_math5.

Bayesian asymm. Laplace (quantile) regression    MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =        887
                                                 Acceptance rate  =      .3555
                                                 Efficiency:  min =     .01064
                                                              avg =     .08481
Log marginal-likelihood                                       max =      .2189

------------------------------------------------------------------------------
             |                                            Equal-tailed
             |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+----------------------------------------------------------------
math5        |
       math3 |  .5771888   .0287285   .001293   .5769971   .5198219   .6337824
       _cons |  31.14144   .3642311   .035305    31.1465    30.4084   31.86673
-------------+----------------------------------------------------------------
       scale |  1.983294   .0683036    .00146   1.981383   1.853755   2.120894
       var_U |  3.679771   1.057536   .043052   3.561201   2.072422   6.134616
------------------------------------------------------------------------------



The posterior mean estimate for the math3 coefficient is 0.58, and the variance of the random intercept is 3.68. If we were to fit a Bayesian two-level linear regression that models the conditional mean using the same prior distributions, we would find that its estimates are similar to the estimates obtained from the median regression. A question of interest might be whether the regression coefficients change across quantiles. We can use the estimates from the median regression as a baseline for comparing models for other quantiles.

#### Random-effects simultaneous quantile regression

Let's use bayesmh to fit two Bayesian quantile regressions simultaneously using a two-equation specification, here for the 0.25 and 0.75 quantiles. We label the two regression equations math5_q25 and math5_q75 to avoid duplicating the outcome-variable name. As in the previous model, we use uninformative priors for the coefficients and variance parameters. We also use a common scale parameter {scale} because the outcome variable is the same in both equations. We increase the burn-in length to 5,000 to improve the adaptation of the MCMC sampler.

. bayesmh
        (math5_q25:math5 math3 U1[school], likelihood(asymlaplaceq({scale},0.25)))
        (math5_q75:math5 math3 U2[school], likelihood(asymlaplaceq({scale},0.75))),
        prior({math5_q25:}, normal(0,10000)) block({math5_q25:})
        prior({math5_q75:}, normal(0,10000)) block({math5_q75:})
        prior({scale},      igamma(0.01,0.01))
        prior({var_U1},     igamma(0.01,0.01)) block({var_U1})
        prior({var_U2},     igamma(0.01,0.01)) block({var_U2})
        burnin(5000) rseed(19)

Burn-in 5000 aaaaaaaaa1000aaaaaaaaa2000aaaaaaaaa3000aaaaaaaaa4000aaaaaaaaa5000
> done
Simulation 10000 .........1000.........2000.........3000.........4000.........50
> 00.........6000.........7000.........8000.........9000.........10000 done

Model summary

Likelihood:
  math5 ~ asymlaplaceq(math5 math3 U1[school],{scale},0.25)
  math5 ~ asymlaplaceq(math5 math3 U2[school],{scale},0.75)

Prior:
  {scale} ~ igamma(0.01,0.01)

Hyperpriors:
  {math5_q25:math3 U1 _cons} ~ normal(0,10000)
  {math5_q75:math3 U2 _cons} ~ normal(0,10000)
             {var_U1 var_U2} ~ igamma(0.01,0.01)
                {U1[school]} ~ normal(0,{var_U1})
                {U2[school]} ~ normal(0,{var_U2})

Bayesian asymm. Laplace (quantile) regression    MCMC iterations  =     15,000
Random-walk Metropolis–Hastings sampling         Burn-in          =      5,000
                                                 MCMC sample size =     10,000
                                                 Number of obs    =        887
                                                 Acceptance rate  =       .339
                                                 Efficiency:  min =     .01429
                                                              avg =     .06357
Log marginal-likelihood                                       max =      .2034

------------------------------------------------------------------------------
             |                                            Equal-tailed
             |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+----------------------------------------------------------------
math5_q25    |
       math3 |   .781409   .0331454   .001575   .7818769   .7174074    .844201
          U1 |         1          0         0          1          1          1
       _cons |  27.11279   .4114259    .03442   27.09728   26.33384   27.93378
-------------+----------------------------------------------------------------
math5_q75    |
       math3 |  .4021812   .0285305     .0013   .4029019    .344931   .4566547
          U2 |         1          0         0          1          1          1
       _cons |  34.21954   .2754823   .018921   34.22476   33.66447   34.76017
-------------+----------------------------------------------------------------
       scale |  1.557948   .0373122   .000827   1.557052   1.486446   1.632176
      var_U1 |  7.793937   2.041174   .076138   7.560785   4.519766   12.38803
      var_U2 |  2.256973   .7545756   .036895   2.153597   1.128305   4.057043
------------------------------------------------------------------------------



The posterior mean estimates of the math3 coefficient for the 0.25 and 0.75 quantiles are 0.78 and 0.40, respectively, which are above and below the earlier median estimate of 0.58. It appears that the regression slope decreases as the quantile value increases. In particular, we can conclude that math3 is a stronger predictor of the 25th percentile of math5 scores than of the 75th percentile of math5 scores.

#### Modeling outcomes using asymmetric Laplace distribution

Finally, we use ALD to examine the skewness (asymmetry) of the math5 variable. For this purpose, we use asymlaplaceq() with an unknown quantile parameter {tau}: {tau} greater than 0.5 indicates negative skewness, {tau} less than 0.5 indicates positive skewness, and {tau} equal to 0.5 indicates perfect symmetry. We place a flat uniform(0,1) prior on {tau}. We specify a constant-only regression model for math5, which corresponds to the location parameter of ALD, and a scale parameter {scale} as in the previous examples.
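Why {tau} encodes the direction of skewness can be seen from the moments of ALD. Using its standard representation as a difference of two independent exponentials, E1/tau - E2/(1-tau) with E1, E2 iid Exp(1) (a textbook property of ALD, not shown in the Stata output), the sign of the skewness is opposite to the sign of tau - 0.5. A short Python check:

```python
import numpy as np  # only for consistency with the other sketches

def ald_skewness(tau):
    """Skewness of ALD(tau) at unit scale, from its representation as
    E1/tau - E2/(1-tau): third cumulant divided by variance^(3/2)."""
    var = 1 / tau**2 + 1 / (1 - tau) ** 2
    k3 = 2 / tau**3 - 2 / (1 - tau) ** 3
    return k3 / var**1.5

print(ald_skewness(0.5))       # 0.0: perfect symmetry
print(ald_skewness(0.25) > 0)  # True: tau < 0.5 gives positive skew
print(ald_skewness(0.86) > 0)  # tau > 0.5 gives negative skew
```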

. bayesmh math5, likelihood(asymlaplaceq({scale}, {tau}))
        prior({math5:}, normal(0,10000))
        prior({scale},  igamma(0.01,0.01))
        prior({tau},    uniform(0,1))
        block({math5:} {scale} {tau}, split)
        initial({tau} 0.5) rseed(19)

Burn-in ...
Simulation ...

Model summary

Likelihood:
  math5 ~ asymlaplaceq({math5:_cons},{scale},{tau})

Priors:
  {math5:_cons} ~ normal(0,10000)
        {scale} ~ igamma(0.01,0.01)
          {tau} ~ uniform(0,1)

Bayesian asymm. Laplace (quantile) regression    MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =        887
                                                 Acceptance rate  =       .424
                                                 Efficiency:  min =     .03815
                                                              avg =     .05242
Log marginal-likelihood = -2846.2182                          max =     .07833

------------------------------------------------------------------------------
             |                                            Equal-tailed
             |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+----------------------------------------------------------------
math5        |
       _cons |  37.00486    .045309   .001619   37.00222    36.9161   37.10699
-------------+----------------------------------------------------------------
       scale |  1.073886   .0655504   .003246   1.071063   .9511403   1.211991
         tau |  .8602184   .0085491   .000438   .8601294    .843211   .8767714
------------------------------------------------------------------------------



The estimated 95% credible interval of [0.84, 0.88] for {tau} lies well above 0.5, indicating strong negative skewness. A more formal test for negative skewness can be conducted with the bayestest interval command.

. bayestest interval {tau}, lower(0.5)

Interval tests     MCMC sample size =    10,000

       prob1 : {tau} > 0.5

-----------------------------------------------
             |      Mean    Std. dev.      MCSE
-------------+---------------------------------
       prob1 |         1     0.00000         0
-----------------------------------------------



Indeed, the estimated posterior probability that {tau} exceeds 0.5 is 1.
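The interval probability reported in the Mean column is simply the share of MCMC draws of {tau} that exceed the lower bound. A minimal Python sketch; the simulated draws below stand in for the actual bayesmh posterior sample, matching its reported posterior mean and standard deviation.

```python
import numpy as np

# Simulated stand-in for the 10,000 posterior draws of {tau}; the real draws
# from bayesmh are centered near 0.86 with a standard deviation of about 0.0085.
rng = np.random.default_rng(19)
draws = rng.normal(loc=0.86, scale=0.0085, size=10_000)

# bayestest interval {tau}, lower(0.5) estimates P(tau > 0.5) as the
# fraction of posterior draws exceeding 0.5:
prob = np.mean(draws > 0.5)
print(prob)  # 1.0: every draw lies far above 0.5
```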

#### References

Mortimore, P., P. Sammons, L. Stoll, D. Lewis, and R. Ecob. 1988. School Matters: The Junior Years. Wells, Somerset, UK: Open Books.

Yu, K., and R. A. Moyeed. 2001. Bayesian quantile regression. Statistics & Probability Letters 54: 437–447.