Quantile regression

  • Including median, minimization of sums of absolute deviations
  • There are now three ways to obtain the VCE:
    • the standard Koenker and Bassett method appropriate for i.i.d. errors;
    • a Huber sandwich estimator that can be used even if the errors are not i.i.d.;
    • the bootstrap.
    For the first two VCE methods above, there are many choices of bandwidth methods and kernels to select from.

Stata fits quantile (including median) regression models, also known as least-absolute-value (LAV) models, minimum absolute deviation (MAD) models, and L1-norm models.

Median regression estimates the median of the dependent variable, conditional on the values of the independent variables. This is similar to least-squares regression, which estimates the conditional mean of the dependent variable. Said differently, median regression finds the regression plane that minimizes the sum of the absolute residuals rather than the sum of the squared residuals.

. webuse auto
(1978 Automobile Data)

. qreg price weight length foreign
Iteration  1:  WLS sum of weighted deviations =  112795.66

Iteration  1: sum of abs. weighted deviations =     111901
Iteration  2: sum of abs. weighted deviations =  110529.43
Iteration  3: sum of abs. weighted deviations =  109524.57
Iteration  4: sum of abs. weighted deviations =   109468.3
Iteration  5: sum of abs. weighted deviations =  109105.27
note: alternate solutions exist
Iteration  6: sum of abs. weighted deviations =  108931.02
Iteration  7: sum of abs. weighted deviations =   108887.4
Iteration  8: sum of abs. weighted deviations =  108822.59

Median regression                                   Number of obs =         74
  Raw sum of deviations   142205 (about 4934)
  Min sum of deviations 108822.6                    Pseudo R2     =     0.2347

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |   3.933588   1.328718     2.96   0.004     1.283543    6.583632
      length |  -41.25191   45.46469    -0.91   0.367    -131.9284    49.42456
     foreign |   3377.771   885.4198     3.81   0.000     1611.857    5143.685
       _cons |   344.6489   5182.394     0.07   0.947     -9991.31    10680.61
------------------------------------------------------------------------------

By default, qreg performs median regression; the estimates above were obtained by minimizing the sum of the absolute residuals.

By comparison, the results from least-squares regression are

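. regress price weight length foreign

      Source |       SS       df       MS              Number of obs =      74
-------------+------------------------------           F(  3,    70) =   28.39
       Model |   348565467     3   116188489           Prob > F      =  0.0000
    Residual |   286499930    70  4092856.14           R-squared     =  0.5489
-------------+------------------------------           Adj R-squared =  0.5295
       Total |   635065396    73  8699525.97           Root MSE      =  2023.1

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |   5.774712   .9594168     6.02   0.000     3.861215    7.688208
      length |  -91.37083   32.82833    -2.78   0.007    -156.8449   -25.89679
     foreign |   3573.092    639.328     5.59   0.000     2297.992    4848.191
       _cons |   4838.021    3742.01     1.29   0.200    -2625.183    12301.22
------------------------------------------------------------------------------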
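
The difference between the two fitting criteria can be checked directly. The following is a minimal sketch, not part of the original output, that refits both models and totals the absolute and squared residuals from each. The variable names r_med, r_ols, abs_med, abs_ols, sq_med, and sq_ols are ours.

```stata
* Sketch: compare the two fits under each criterion
webuse auto, clear

* Residuals from median regression
quietly qreg price weight length foreign
predict double r_med, residuals

* Residuals from least-squares regression
quietly regress price weight length foreign
predict double r_ols, residuals

* Absolute and squared residuals for each fit
generate double abs_med = abs(r_med)
generate double abs_ols = abs(r_ols)
generate double sq_med  = r_med^2
generate double sq_ols  = r_ols^2

* Column totals: one criterion per pair of columns
tabstat abs_med abs_ols sq_med sq_ols, statistics(sum)
```

We would expect the qreg fit to have the smaller total of absolute residuals and the regress fit to have the smaller total of squared residuals, because each estimator minimizes its own criterion.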

qreg can also estimate the regression plane for quantiles other than 0.5 (the median). For instance, the following model describes the 25th percentile (.25 quantile) of price:

. qreg price weight length foreign, quantile(.25)
Iteration  1:  WLS sum of weighted deviations =   98938.47

Iteration  1: sum of abs. weighted deviations =  99457.766
Iteration  2: sum of abs. weighted deviations =  91339.779
Iteration  3: sum of abs. weighted deviations =  86833.292
Iteration  4: sum of abs. weighted deviations =  83894.441
Iteration  5: sum of abs. weighted deviations =  82186.051
Iteration  6: sum of abs. weighted deviations =  75246.848
Iteration  7: sum of abs. weighted deviations =  71442.907
Iteration  8: sum of abs. weighted deviations =  70452.617
Iteration  9: sum of abs. weighted deviations =  69646.639
Iteration 10: sum of abs. weighted deviations =  69603.553

.25 Quantile regression                             Number of obs =         74
  Raw sum of deviations  83825.5 (about 4187)
  Min sum of deviations 69603.55                    Pseudo R2     =     0.1697

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |   1.831789   .6328903     2.89   0.005     .5695289    3.094049
      length |    2.84556   21.65558     0.13   0.896    -40.34514    46.03626
     foreign |   2209.925   421.7401     5.24   0.000     1368.791    3051.059
       _cons |  -1879.775    2468.46    -0.76   0.449    -6802.963    3043.413
------------------------------------------------------------------------------
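
For reference, the criterion behind a general quantile fit can be written with the usual check (pinball) loss. The notation below (tau for the quantile, rho for the loss, x_i for the covariate vector, beta for the coefficients) is ours and does not appear in the Stata output, and the scaling may differ from the "sum of deviations" that qreg reports.

```latex
\hat\beta(\tau) = \arg\min_{\beta} \sum_{i=1}^{n} \rho_\tau\!\left(y_i - x_i'\beta\right),
\qquad
\rho_\tau(u) = u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr)
```

At tau = 0.5 the loss is |u|/2, so the minimizer is the same plane that minimizes the sum of the absolute residuals. At tau = 0.25, positive residuals receive weight 0.25 and negative residuals weight 0.75, which pulls the fitted plane toward the lower part of the price distribution.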

Here, we perform median regression but request robust standard errors.

. qreg price weight length foreign, vce(robust)
Iteration  1:  WLS sum of weighted deviations =  112795.66

Iteration  1: sum of abs. weighted deviations =     111901
Iteration  2: sum of abs. weighted deviations =  110529.43
Iteration  3: sum of abs. weighted deviations =  109524.57
Iteration  4: sum of abs. weighted deviations =   109468.3
Iteration  5: sum of abs. weighted deviations =  109105.27
note: alternate solutions exist
Iteration  6: sum of abs. weighted deviations =  108931.02
Iteration  7: sum of abs. weighted deviations =   108887.4
Iteration  8: sum of abs. weighted deviations =  108822.59

Median regression                                   Number of obs =         74
  Raw sum of deviations   142205 (about 4934)
  Min sum of deviations 108822.6                    Pseudo R2     =     0.2347

------------------------------------------------------------------------------
             |               Robust
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |   3.933588   1.694477     2.32   0.023       .55406    7.313116
      length |  -41.25191   51.73571    -0.80   0.428    -144.4355    61.93171
     foreign |   3377.771   728.5115     4.64   0.000     1924.801    4830.741
       _cons |   344.6489   5096.528     0.07   0.946    -9820.055    10509.35
------------------------------------------------------------------------------
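
The density estimator and bandwidth used by the i.i.d. and robust VCEs can be chosen inside vce(). The sketch below assumes the suboption names kernel() and chamberlain as we understand them from the qreg documentation for recent Stata releases; the particular kernel and bandwidth are arbitrary illustrations, so check help qreg in your version before relying on them.

```stata
webuse auto, clear

* i.i.d. VCE with a Parzen kernel density estimator and Chamberlain's bandwidth
* (suboption names assumed from the qreg documentation)
qreg price weight length foreign, vce(iid, kernel(parzen) chamberlain)

* Robust VCE with the same kernel and bandwidth choices
qreg price weight length foreign, vce(robust, kernel(parzen) chamberlain)
```

Only the standard errors and the quantities derived from them should change between these runs; the coefficient estimates do not depend on the VCE choice.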

Stata can also provide bootstrapped standard errors, using the bsqreg command:

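. set seed 1001

. bsqreg price weight length foreign
(fitting base model)

Bootstrap replications (20)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
....................

Median regression, bootstrap(20) SEs                Number of obs =         74
  Raw sum of deviations   142205 (about 4934)
  Min sum of deviations 108822.6                    Pseudo R2     =     0.2347

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |   3.933588    3.12446     1.26   0.212    -2.297951    10.16513
      length |  -41.25191   83.71267    -0.49   0.624    -208.2116    125.7077
     foreign |   3377.771   1057.281     3.19   0.002      1269.09    5486.452
       _cons |   344.6489   7053.301     0.05   0.961    -13722.72    14412.01
------------------------------------------------------------------------------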

The coefficient estimates are the same as those in the first example. The standard errors differ, and therefore so do the t statistics, significance levels, and confidence intervals.
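
The output above is based on 20 bootstrap replications, which is what bsqreg used when no number was specified. Bootstrap standard errors from so few replications can vary noticeably from seed to seed, so in practice we would usually request more with the reps() option. The value 500 below is an arbitrary illustration, not a recommendation from the original text.

```stata
webuse auto, clear

* Fix the random-number seed so the bootstrap results are reproducible
set seed 1001

* Median regression with standard errors from 500 bootstrap replications
bsqreg price weight length foreign, reps(500)
```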

Stata can also perform simultaneous-quantile regression, which fits several quantile regressions at once and bootstraps a single VCE for all of them, so that coefficients can be compared across quantiles:

. set seed 1001

. sqreg price weight length foreign, q(.25 .5 .75)
(fitting base model)

Bootstrap replications (20)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
....................

Simultaneous quantile regression                    Number of obs =         74
  bootstrap(20) SEs                                 .25 Pseudo R2 =     0.1697
                                                    .50 Pseudo R2 =     0.2347
                                                    .75 Pseudo R2 =     0.3840

------------------------------------------------------------------------------
             |              Bootstrap
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
q25          |
      weight |   1.831789   .9837745     1.86   0.067    -.1302874    3.793865
      length |    2.84556   23.78166     0.12   0.905    -44.58546    50.27658
     foreign |   2209.925   791.2162     2.79   0.007     631.8944    3787.956
       _cons |  -1879.775   2862.855    -0.66   0.514    -7589.559     3830.01
-------------+----------------------------------------------------------------
q50          |
      weight |   3.933588   2.800286     1.40   0.165    -1.651408    9.518583
      length |  -41.25191     76.813    -0.54   0.593    -194.4506    111.9468
     foreign |   3377.771   1099.259     3.07   0.003     1185.369    5570.173
       _cons |   344.6489    6764.42     0.05   0.960    -13146.56    13835.86
-------------+----------------------------------------------------------------
q75          |
      weight |    9.22291   2.903055     3.18   0.002      3.43295    15.01287
      length |  -220.7833   104.5321    -2.11   0.038    -429.2661   -12.30048
     foreign |   3595.133   1051.723     3.42   0.001     1497.539    5692.728
       _cons |    20242.9   11054.76     1.83   0.071    -1805.126    42290.93
------------------------------------------------------------------------------

We can test whether the effect of weight is the same at the 25th and 75th percentiles:

. test [q25]weight = [q75]weight

 ( 1)  [q25]weight - [q75]weight = 0

       F(  1,    70) =    7.87
            Prob > F =    0.0065
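
The same approach extends to a joint test across all three fitted quantiles. The sketch below is our own addition: it refits the model quietly and uses test with the accumulate option to combine two equality constraints into one hypothesis. We do not show output because the bootstrap results depend on the seed and the Stata version.

```stata
webuse auto, clear
set seed 1001

* Refit the three quantiles jointly so the VCE covers all of them
quietly sqreg price weight length foreign, q(.25 .5 .75)

* First constraint: the weight effect is equal at the .25 and .5 quantiles
test [q25]weight = [q50]weight

* Add the second constraint and test both jointly
test [q50]weight = [q75]weight, accumulate
```

The second test command reports a single F statistic with two numerator degrees of freedom for the hypothesis that the effect of weight is the same at all three quantiles.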

We can obtain a confidence interval for the difference in the effect of weight at the 25th and 75th percentiles:

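. lincom [q75]weight-[q25]weight

 ( 1)  - [q25]weight + [q75]weight = 0

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   7.391121   2.634484     2.81   0.006      2.13681    12.64543
------------------------------------------------------------------------------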

Stata also performs interquantile regression, which focuses on one quantile comparison:

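. set seed 1001

. iqreg price weight length foreign, q(.25 .75)
(fitting base model)

Bootstrap replications (20)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
....................

.75-.25 Interquantile regression                    Number of obs =         74
  bootstrap(20) SEs                                 .75 Pseudo R2 =     0.3840
                                                    .25 Pseudo R2 =     0.1697

------------------------------------------------------------------------------
             |              Bootstrap
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |   7.391121   2.634484     2.81   0.006      2.13681    12.64543
      length |  -223.6288   98.00504    -2.28   0.026    -419.0937   -28.16396
     foreign |   1385.208   1080.812     1.28   0.204    -770.4034    3540.819
       _cons |   22122.68   10940.56     2.02   0.047     302.4105    43942.95
------------------------------------------------------------------------------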
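
Each interquantile coefficient is the difference between the coefficients at the two quantiles. For weight, the simultaneous-quantile estimates shown earlier give 9.22291 - 1.831789 = 7.391121, which matches the iqreg coefficient. The sketch below, which is our own addition, reproduces that check from separate qreg fits; the scalar names b25 and b75 are ours.

```stata
webuse auto, clear

* Coefficient on weight at the .25 quantile
quietly qreg price weight length foreign, quantile(.25)
scalar b25 = _b[weight]

* Coefficient on weight at the .75 quantile
quietly qreg price weight length foreign, quantile(.75)
scalar b75 = _b[weight]

* Interquantile regression for the same pair of quantiles
set seed 1001
quietly iqreg price weight length foreign, q(.25 .75)

* The two quantities should agree up to rounding
display "q75 - q25 from qreg: " (b75 - b25)
display "iqreg coefficient:   " _b[weight]
```

The point estimates agree regardless of the seed; only the bootstrapped standard errors depend on it.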
