Bayesian quantile regression

Highlights

• Bayesian estimates of quantile regression coefficients

• Flexible prior specifications

• Comprehensive posterior inference

• Model-based “standard errors”

• Full support of Bayesian postestimation features


The new bayes: qreg command fits Bayesian quantile regression. The Bayesian framework provides full posterior distributions for quantile regression coefficients that offer comprehensive inference, including model-based “standard errors”. All standard Bayesian features, such as hypothesis testing and prediction, are supported. This command is part of StataNow™.

Quantile regression models the conditional quantiles of an outcome as a linear combination of predictors. Traditional quantile regression estimates the coefficients by minimizing an asymmetric absolute-deviation loss (the "check" loss), typically with linear programming. Yu and Moyeed (2001) introduced Bayesian quantile regression through an equivalent formulation that assumes an asymmetric Laplace distribution for the likelihood function. Bayesian quantile regression combines this likelihood with priors for the model parameters to form a posterior model and uses Markov chain Monte Carlo (MCMC) for estimation. This provides full posterior distributions of model parameters for comprehensive inference, including model-based “standard errors”.
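The equivalence between the two formulations can be written out explicitly. The sketch below uses the usual Yu and Moyeed notation; the symbols $\tau$ (the quantile), $\rho_\tau$ (the check loss), $x_i$, and $\beta$ are introduced here for illustration, and we assume, without having verified it against the Stata documentation, that asymlaplaceq() follows the same location, scale, and quantile parameterization.

```latex
% Check loss minimized by classical quantile regression for quantile \tau
\rho_\tau(u) = u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr),
\qquad
\hat\beta(\tau) = \arg\min_{\beta} \sum_{i=1}^{n} \rho_\tau\!\left(y_i - x_i'\beta\right)

% Asymmetric Laplace density with location \mu, scale \sigma > 0, and quantile \tau
f(y \mid \mu, \sigma, \tau)
  = \frac{\tau(1-\tau)}{\sigma}
    \exp\!\left\{-\rho_\tau\!\left(\frac{y-\mu}{\sigma}\right)\right\}

% Log likelihood with \mu_i = x_i'\beta, using \rho_\tau(u/\sigma) = \rho_\tau(u)/\sigma
\log L(\beta, \sigma)
  = n \log\frac{\tau(1-\tau)}{\sigma}
    - \frac{1}{\sigma} \sum_{i=1}^{n} \rho_\tau\!\left(y_i - x_i'\beta\right)
```

For any fixed $\sigma$, maximizing this log likelihood over $\beta$ is the same as minimizing the summed check loss, which is why the asymmetric Laplace likelihood reproduces the classical quantile regression objective.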

In classical quantile regression, standard errors are computed by using bootstrap or kernel-based methods. In the Bayesian framework, posterior standard deviations play the role of standard errors. Because the Bayesian approach assumes a parametric likelihood model, the posterior standard deviations are estimated under that model and may be more efficient.
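For comparison, the classical standard errors can be obtained with Stata's frequentist quantile regression commands. The following is a minimal sketch, not part of the original example: it uses the same dataset as the demonstration below, and the seed and the number of bootstrap replications are arbitrary illustrative choices.

```stata
* Classical median regression; standard errors come from qreg's default VCE
webuse engel1857, clear
qreg foodexp income

* Bootstrap standard errors for the same model
* (seed and reps(200) are illustrative choices, not recommendations)
set seed 19
bsqreg foodexp income, reps(200)
```

The reported standard errors can then be set side by side with the posterior standard deviations from bayes: qreg.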

Here we demonstrate a univariate Bayesian quantile regression. For other Bayesian quantile models, including random effects and multiple quantiles, see Bayesian asymmetric Laplace model.

#### Let's see it work

Let's explore the relationship between household income and food expenditure using the data from Engel (1857), which are described in Koenker and Bassett (1982). Let's use a quantile regression to compare this relationship across different quantiles. We first fit a model to the 50th percentile of the outcome variable, a model known as median regression, using default settings. We specify the rseed(19) option for reproducibility.

. webuse engel1857
(European household budget survey)

. bayes, rseed(19): qreg foodexp income

Burn-in ...
Simulation ...

Model summary

Likelihood:
foodexp ~ asymlaplaceq(xb_foodexp_q50,{sigma},.5)

Priors:
{foodexp_q50:income _cons} ~ normal(0,10000)                             (1)
{sigma} ~ igamma(0.01,0.01)

(1) Parameters are elements of the linear form xb_foodexp_q50.

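Bayesian quantile regression                     MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
Quantile = .5                                    Number of obs    =        235
                                                 Acceptance rate  =      .3603
                                                 Efficiency:  min =     .09896
                                                              avg =       .151
Log marginal-likelihood =  186.43947                          max =      .2268

------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+----------------------------------------------------------------
foodexp_q50  |
      income |  .5567276   .0159401   .000507   .5562547   .5248025    .587735
       _cons |   .084986   .0143782   .000403   .0851108   .0575581   .1134264
-------------+----------------------------------------------------------------
       sigma |  .0377533   .0024907   .000052   .0376511   .0331066   .0430957
------------------------------------------------------------------------------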



The mean posterior estimate for the coefficient of income is 0.56 with a 95% credible interval (CrI) of [0.52, 0.59]. We now shift our attention to the 25th percentile (or 0.25 quantile) of the outcome variable by specifying the quantile() option with qreg.

. bayes, rseed(19): qreg foodexp income, quantile(0.25)

Burn-in ...
Simulation ...

Model summary

Likelihood:
foodexp ~ asymlaplaceq(xb_foodexp_q25,{sigma},.25)

Priors:
{foodexp_q25:income _cons} ~ normal(0,10000)                             (1)
{sigma} ~ igamma(0.01,0.01)

(1) Parameters are elements of the linear form xb_foodexp_q25.

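Bayesian quantile regression                     MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
Quantile = .25                                   Number of obs    =        235
                                                 Acceptance rate  =      .3423
                                                 Efficiency:  min =      .1436
                                                              avg =      .1765
Log marginal-likelihood =  169.18624                          max =      .2421

------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+----------------------------------------------------------------
foodexp_q25  |
      income |  .4718604   .0140225    .00037   .4735463   .4414884   .4948657
       _cons |  .0962851   .0116976   .000308   .0957929   .0742573   .1196877
-------------+----------------------------------------------------------------
       sigma |  .0304463   .0020364   .000041   .0303373   .0266857   .0347907
------------------------------------------------------------------------------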



The mean posterior estimate for the coefficient of income is 0.47 with a 95% CrI of [0.44, 0.49]. The CrIs from the two quantile regressions do not overlap, which suggests that the relationship between income and food expenditure differs between the 0.25 and 0.50 quantiles. We can explore this relationship further by specifying different quantiles in the quantile() option with bayes: qreg and using the results to produce the graph below:

The graph demonstrates heterogeneity of the income coefficients across the distribution (quantiles) of food expenditure. The coefficient increases as the quantile value increases. (This does not mean that the proportion of income spent on food increases with income. If we were to obtain predictions and produce the Engel curve, we would see that food expenditure share decreases with income, as expected.)
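One way to build a graph of this kind is sketched below. This is an illustration, not the code used for the original figure: the choice of quantiles (0.1 through 0.9), the file name qcoefs, and the plot styling are all assumptions. The sketch also assumes that bayesstats summary returns its table in the matrix r(summary), with income in the first row and the mean and credible-interval bounds in columns 1, 5, and 6, matching the layout of the displayed output.

```stata
* Sketch: posterior means and 95% CrIs of the income coefficient across quantiles
webuse engel1857, clear

* qcoefs is a hypothetical file name for the collected results
tempname results
postfile `results' quantile mean lower upper using qcoefs, replace

forvalues p = 10(10)90 {
    local q = `p'/100
    quietly bayes, rseed(19): qreg foodexp income, quantile(`q')
    quietly bayesstats summary
    matrix S = r(summary)
    * Assumed layout: row 1 = income; columns 1, 5, 6 = mean, CrI lower, CrI upper
    post `results' (`q') (S[1,1]) (S[1,5]) (S[1,6])
}
postclose `results'

* Plot the posterior means with a shaded 95% CrI band
use qcoefs, clear
twoway (rarea lower upper quantile, color(gs13))          ///
       (line mean quantile),                              ///
       xtitle("Quantile") ytitle("Coefficient of income") ///
       legend(order(2 "Posterior mean" 1 "95% CrI"))
```

Each pass through the loop runs a full MCMC simulation, so the sketch takes roughly nine times as long as a single bayes: qreg fit.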

All existing Bayesian postestimation commands are available after bayes: qreg. For example, we can compute the posterior probability that the income coefficient in the model for the 25th percentile of foodexp lies within the interval [0.525, 0.588], the 95% CrI obtained from the median regression model. To accomplish this, we use the bayestest interval command.

. bayestest interval {foodexp_q25:income}, lower(.525) upper(.588)

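Interval tests     MCMC sample size =    10,000

       prob1 : .525 < {foodexp_q25:income} < .588

------------------------------------------------------------------------------
             |      Mean    Std. dev.      MCSE
-------------+----------------------------------------------------------------
       prob1 |         0     0.00000          0
------------------------------------------------------------------------------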



The estimated posterior probability is 0, which suggests that the effect of income on food expenditure differs between the 25th and 50th percentiles.
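Other postestimation commands can be applied to the same fitted model. As an illustrative sketch that is not part of the original example, the commands below check MCMC convergence and sampling efficiency for the 25th-percentile model, which is a sensible step before relying on posterior summaries such as the interval test above.

```stata
* Trace, autocorrelation, histogram, and density plots for the income coefficient
bayesgraph diagnostics {foodexp_q25:income}

* Effective sample sizes, correlation times, and efficiencies for all parameters
bayesstats ess
```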

#### References

Engel, E. 1857. Die Productions- und Consumtionsverhältnisse des Königreichs Sachsen. Zeitschrift des Statistischen Bureaus des Königlich Sächsischen Ministeriums des Innern 8: 1–54.

Koenker, R., and G. Bassett, Jr. 1982. Robust tests for heteroscedasticity based on regression quantiles. Econometrica 50: 43–61. https://doi.org/10.2307/1912528.

Yu, K., and R. A. Moyeed. 2001. Bayesian quantile regression. Statistics & Probability Letters 54: 437–447.