How do I fit a linear regression with interval (inequality) constraints in Stata?
|
Title
|
|
Fitting a Gaussian or lognormal regression with interval(inequality) constraints
|
|
Author
|
Isabel Canette, StataCorp
|
In the FAQ How do I fit a regression with interval (inequality) constraints in Stata? we explain how to include interval constraints in a general regression that can be fit by maximum likelihood. However, if you have a linear or nonlinear regression with Gaussian (or lognormal) errors, the simplest way to include these kinds of constraints is by using the nl command.
Introduction
First, let's review how to fit a linear regression using
nl. We will use this command to fit a regression of
mpg2 on price and turn:
. sysuse auto
(1978 automobile data)
. generate mpg2 = -mpg
. nl (mpg2 = {a}*price + {b}*turn + {c})
Iteration 0: Residual SS = 36008
Iteration 1: Residual SS = 1016.186
Iteration 2: Residual SS = 1016.1859
| Source | | SS df MS | |
| | | | Number of obs = 74 |
| Model | | 1427.2735 2 713.636766 | R-squared = 0.5841 |
| Residual | | 1016.1859 71 14.3124778 | Adj R-squared = 0.5724 |
| | | | Root MSE = 3.783184 |
| Total | | 2443.4595 73 33.4720474 | Res. dev. = 403.8641 |
|
| mpg2 | | Coefficient Std. err. t P>|t| [95% conf. interval] |
| | | |
| /a | | .0005335 .0001579 3.38 0.001 .0002187 .0008483 |
| /b | | .8350376 .1058498 7.89 0.000 .6239791 1.046096 |
| /c | | -57.69477 4.027951 -14.32 0.000 -65.72627 -49.66326 |
|
Note: Parameter c is used as a constant term during estimation.
We will set inequality constraints (and interval constraints) via
transformations. For example, let's assume that we want parameter
a to be positive. This can be achieved by expressing
a as an exponential. We can, therefore, estimate
lna = ln(a) and then recover a = exp(lna)
after the estimation. The trick is to use a transformation whose
range is the interval over which we want to restrict the parameter.
Type
. help math functions
to see the mathematical functions available in Stata.
There are many ways to set interval constraints. The following examples show
some possibilities.
Example 1: Constraints of the form a > 0
As stated before, we will estimate ln(a) instead of a.
. nl (mpg2 = {a:}*price + {b}*turn + {c}), define(a:exp({lna}))
Iteration 0: Residual SS = 3.466e+09
Iteration 1: Residual SS = 261256.11
Iteration 2: Residual SS = 22313.675
Iteration 3: Residual SS = 1058.9108
Iteration 4: Residual SS = 1016.2076
Iteration 5: Residual SS = 1016.1859
Iteration 6: Residual SS = 1016.1859
| Source | | SS df MS | |
| | | | Number of obs = 74 |
| Model | | 1427.2735 2 713.636766 | R-squared = 0.5841 |
| Residual | | 1016.1859 71 14.3124778 | Adj R-squared = 0.5724 |
| | | | Root MSE = 3.783184 |
| Total | | 2443.4595 73 33.4720474 | Res. dev. = 403.8641 |
|
| mpg2 | | Coefficient Std. err. t P>|t| [95% conf. interval] |
| | | |
| /b | | .8350376 .1058498 7.89 0.000 .6239791 1.046096 |
| /c | | -57.69477 4.027951 -14.32 0.000 -65.72627 -49.66326 |
| /lna | | -7.535992 .2959177 -25.47 0.000 -8.126035 -6.945949 |
|
Note: Parameter c is used as a constant term during estimation.
.
margins, predict(parameters(a))
|
| | | Delta-method |
| | | Margin std. err. z P>|z| [95% conf. interval] |
| | | |
| _cons | | .0005335 .0001579 3.38 0.001 .0002241 .000843 |
|
The output shows the parameter lna. To recover
a, we can use the nlcom command; we can
always call nl (or any estimation command) with the
coeflegend option to see how to refer to the parameters in
further expressions.
. nl, coeflegend
| Source | | SS df MS | |
| | | | Number of obs = 74 |
| Model | | 1427.2735 2 713.636766 | R-squared = 0.5841 |
| Residual | | 1016.1859 71 14.3124778 | Adj R-squared = 0.5724 |
| | | | Root MSE = 3.783184 |
| Total | | 2443.4595 73 33.4720474 | Res. dev. = 403.8641 |
|
| mpg2 | | Coefficient Legend |
| | | |
| /b | | .8350376 _b[/b] |
| /c | | -57.69477 _b[/c] |
| /lna | | -7.535992 _b[/lna] |
|
Note: Parameter c is used as a constant term during estimation.
.
nlcom a: exp(_b[lna:_cons])
a: exp(_b[lna:_cons])
|
| mpg2 | | Coefficient Std. err. z P>|z| [95% conf. interval] |
| | | |
| a | | .0005335 .0001579 3.38 0.001 .0002241 .000843 |
|
Example 2: Constraints of the form 0 < a < 1
Because the range of the inverse logit function is the interval (0,1), we
can use the Stata function invlogit() to set this
restriction.
. nl (mpg2 = {a:}*price + {b}*turn + {c}), define(a: invlogit({lgta}))
Iteration 0: Residual SS = 8.711e+08
Iteration 1: Residual SS = 71157.768
Iteration 2: Residual SS = 3521.3845
Iteration 3: Residual SS = 1029.0901
Iteration 4: Residual SS = 1016.2178
Iteration 5: Residual SS = 1016.1859
Iteration 6: Residual SS = 1016.1859
| Source | | SS df MS | |
| | | | Number of obs = 74 |
| Model | | 1427.2735 2 713.636766 | R-squared = 0.5841 |
| Residual | | 1016.1859 71 14.3124778 | Adj R-squared = 0.5724 |
| | | | Root MSE = 3.783184 |
| Total | | 2443.4595 73 33.4720474 | Res. dev. = 403.8641 |
|
| mpg2 | | Coefficient Std. err. t P>|t| [95% conf. interval] |
| | | |
| /b | | .8350376 .1058498 7.89 0.000 .6239791 1.046096 |
| /c | | -57.69477 4.027951 -14.32 0.000 -65.72627 -49.66326 |
| /lgta | | -7.535459 .2960757 -25.45 0.000 -8.125817 -6.945101 |
|
Note: Parameter c is used as a constant term during estimation.
.
margins, predict(parameters(a))
warning: prediction constant over observations.
Predictive margins Number of obs = 74
Model VCE: GNR
Expression: Parameter a, predict(parameters(a))
|
| | | Delta-method |
| | | Margin std. err. z P>|z| [95% conf. interval] |
| | | |
| _cons | | .0005335 .0001579 3.38 0.001 .0002241 .000843 |
|
Example 3: Constraints of the form -1 < a < 1
We can use the hyperbolic tangent function for constraints like this.
. nl (mpg2 = {a:}*price + {b}*turn + {c}), define(a: tanh({atanha}))
Iteration 0: Residual SS = 36008
Iteration 1: Residual SS = 1016.186
Iteration 2: Residual SS = 1016.1859
| Source | | SS df MS | |
| | | | Number of obs = 74 |
| Model | | 1427.2735 2 713.636766 | R-squared = 0.5841 |
| Residual | | 1016.1859 71 14.3124778 | Adj R-squared = 0.5724 |
| | | | Root MSE = 3.783184 |
| Total | | 2443.4595 73 33.4720474 | Res. dev. = 403.8641 |
|
| mpg2 | | Coefficient Std. err. t P>|t| [95% conf. interval] |
| | | |
| /atanha | | .0005335 .0001579 3.38 0.001 .0002187 .0008483 |
| /b | | .8350376 .1058498 7.89 0.000 .6239791 1.046096 |
| /c | | -57.69477 4.027951 -14.32 0.000 -65.72627 -49.66326 |
|
Note: Parameter c is used as a constant term during estimation.
.
margins, predict(parameters(a))
warning: prediction constant over observations.
Predictive margins Number of obs = 74
Model VCE: GNR
Expression: Parameter a, predict(parameters(a))
|
| | | Delta-method |
| | | Margin std. err. z P>|z| [95% conf. interval] |
| | | |
| _cons | | .0005335 .0001579 3.38 0.001 .0002241 .000843 |
|
Example 4: Constraints of the form 0 < a < b
We can express a as an exponential, as in Example 1, to
ensure that it will be positive. In addition, we want to set the restriction
b>a; therefore, we can also express the difference
b−a as an exponential.
. nl (mpg2 = {a:}*price + {b:}*turn + {c}), define(a:exp({lna})) define(b:
exp({lna})+ exp({lndiff}))
Iteration 0: Residual SS = 3.540e+09
Iteration 1: Residual SS = 2848033.8
Iteration 2: Residual SS = 54058.852
Iteration 3: Residual SS = 1196.8817
Iteration 4: Residual SS = 1045.7428
Iteration 5: Residual SS = 1016.206
Iteration 6: Residual SS = 1016.1859
Iteration 7: Residual SS = 1016.1859
| Source | | SS df MS | |
| | | | Number of obs = 74 |
| Model | | 1427.2735 2 713.636766 | R-squared = 0.5841 |
| Residual | | 1016.1859 71 14.3124778 | Adj R-squared = 0.5724 |
| | | | Root MSE = 3.783184 |
| Total | | 2443.4595 73 33.4720474 | Res. dev. = 403.8641 |
|
| mpg2 | | Coefficient Std. err. t P>|t| [95% conf. interval] |
| | | |
| /c | | -57.69477 4.027951 -14.32 0.000 -65.72627 -49.66326 |
| /lna | | -7.535992 .2959177 -25.47 0.000 -8.126035 -6.945949 |
| /lndiff | | -.1809177 .1269003 -1.43 0.158 -.4339496 .0721142 |
|
Note: Parameter c is used as a constant term during estimation.
.
margins, predict(parameters(a))
warning: prediction constant over observations.
Predictive margins Number of obs = 74
Model VCE: GNR
Expression: Parameter a, predict(parameters(a))
|
| | | Delta-method |
| | | Margin std. err. z P>|z| [95% conf. interval] |
| | | |
| _cons | | .0005335 .0001579 3.38 0.001 .0002241 .000843 |
|
.
margins, predict(parameters(b))
warning: prediction constant over observations.
Predictive margins Number of obs = 74
Model VCE: GNR
Expression: Parameter b, predict(parameters(b))
|
| | | Delta-method |
| | | Margin std. err. z P>|z| [95% conf. interval] |
| | | |
| _cons | | .8350376 .1058498 7.89 0.000 .6275758 1.042499 |
|
Example 5: Constraints of the form k1.a < k2.b < k2.c
The concepts in Example 4 can be extended to similar cases, like
a<b<c or a<b<2c.
For example, let's assume that we want to fit the regression
nl (turn = {a}*headroom + {b}*displacement + {c})
with the constraints 0.5a<10b<2c.
These two inequalities can be presented as
10b - 0.5a > 0
2c - 10b > 0
and we can express the left-hand sides of these as
exponentials to ensure that they will turn out positive.
We can then estimate the two following parameters
d1 = exp(lnd1) = 10b - 0.5a
d2 = exp(lnd2) = 2c - 10b
which imply that we can substitute b and c
as follows:
b = 0.1(d1 + 0.5a)
c = 0.5(d2 + 10b)
Hence, our command line would look like this:
. nl (turn = {a}*headroom +{b:}*displacement +{c:}), define(b: 0.1*(exp
({lnd1}) +0.5*{a})) define(c: 0.5*(exp({lnd2})+exp({lnd1})+0.5*{b:}))
Iteration 0: Residual SS = 29404.326
Iteration 1: Residual SS = 3726.5925
Iteration 2: Residual SS = 2338.4815
Iteration 3: Residual SS = 1306.567
Iteration 4: Residual SS = 556.73518
Iteration 5: Residual SS = 554.69695
Iteration 6: Residual SS = 554.69685
Iteration 7: Residual SS = 554.69685
| Source | | SS df MS | |
| | | | Number of obs = 74 |
| Model | | 858.16801 2 429.084007 | R-squared = 0.6074 |
| Residual | | 554.69685 71 7.8126317 | Adj R-squared = 0.5963 |
| | | | Root MSE = 2.795109 |
| Total | | 1412.8649 73 19.3543132 | Res. dev. = 359.0653 |
|
| turn | | Coefficient Std. err. t P>|t| [95% conf. interval] |
| | | |
| /a | | .3751308 .4392973 0.85 0.396 -.5008031 1.251065 |
| /lnd1 | | -1.782972 1.436274 -1.24 0.219 -4.646822 1.080878 |
| /lnd2 | | 4.14043 .0361865 114.42 0.000 4.068276 4.212583 |
|
Note: Parameter lnd2 is used as a constant term during estimation.
.
margins, predict(parameters(b))
warning: prediction constant over observations.
Predictive margins Number of obs = 74
Model VCE: GNR
Expression: Parameter b, predict(parameters(b))
|
| | | Delta-method |
| | | Margin std. err. z P>|z| [95% conf. interval] |
| | | |
| _cons | | .0355703 .0040468 8.79 0.000 .0276388 .0435018 |
|
.
margins, predict(parameters(c))
warning: prediction constant over observations.
Predictive margins Number of obs = 74
Model VCE: GNR
Expression: Parameter c, predict(parameters(c))
|
| | | Delta-method |
| | | Margin std. err. z P>|z| [95% conf. interval] |
| | | |
| _cons | | 31.50786 1.214814 25.94 0.000 29.12687 33.88886 |
|
If we want to construct a table with the three parameters, we can use the collect prefix, as follows:
collect clear
collect: nl (turn = {a}*headroom +{b:}*displacement +{c:}),
define(b: 0.1*(exp({lnd1})+0.5*{a}))
define(c: 0.5*(exp({lnd2})+exp({lnd1})+0.5*{b:}))
collect: margins, predict(parameters(b))
collect: margins, predict(parameters(c))
collect label levels cmdset 1 "a", modify
collect label levels cmdset 2 "b", modify
collect label levels cmdset 3 "c", modify
collect style header cmdset, level(label)
collect style header colname, level(hide)
collect layout (colname[a]#cmdset[1] colname[_cons]#cmdset[2] colname[_cons]#cmdset[3])
(result[_r_b _r_se _r_z _r_p _r_ci])
Example 6: Setting a model where parameters are proportions
Finally, let’s see how to fit a model where the coefficients are
proportions; that is, they are all positive and add up to one.
We will fit the linear model
y = a1*x1 + a2*x2 + a3*x3 + a4 + ε
where a1, a2, and a3 are
positive, and a1 + a2 + a3 = 1.
We will use the transformation implemented in the Stata command
mlogit:
a2 = exp(t2)/(1+exp(t2)+exp(t3))
a3 = exp(t3)/(1+exp(t2)+exp(t3))
a1 = 1/(1+exp(t2)+exp(t3))
Here we illustrate the concept with simulated data:
. clear
. set seed 12345
. set obs 1000
Number of observations (_N) was 0, now 1,000.
. generate x1 = invnormal(runiform())
. generate x2 = invnormal(runiform())
. generate x3 = invnormal(runiform())
. generate ep = invnormal(runiform())
. generate y = .2*x1 + .5*x2 + .3*x3 + 1 + ep
Although the actual coefficients used for the simulation add up to one, the
estimates obtained with regress most likely will not.
. nl (y = {a:}*x1 + {b:}*x2 + {c:}*x3 + {d}), define(b: exp({t2})/(1+exp
({t2})+exp({t3}))) define(c: exp({t3})/(1+exp({t2})+exp({t3}))) define
(a: 1/(1+exp({t2})+exp({t3})))
| Source | | SS df MS | |
| | | Number of obs = 1,000 |
| Model | | 430.4658 2 215.232898 | R-squared = 0.3028 |
| Residual | | 991.16818 997 .99415063 | Adj R-squared = 0.3014 |
| | | Root MSE = .997071 |
| Total | | 1421.634 999 1.42305703 | Res. dev. = 2829.006 |
|
| y | | Coefficient Std. err. t P>|t| [95% conf. interval] |
| | |
| /d | | .9913625 .0315356 31.44 0.000 .9294787 1.053246 |
| /t2 | | .7524281 .1506533 4.99 0.000 .4567941 1.048062 |
| /t3 | | .2617824 .1748548 1.50 0.135 -.0813433 .6049081 |
|
Note: Parameter d is used as a constant term during estimation.
.
margins, predict(parameters(a))
warning: prediction constant over observations.
Predictive margins Number of obs = 1,000
Model VCE: GNR
Expression: Parameter a, predict(parameters(a))
|
| | | Delta-method |
| | | Margin std. err. z P>|z| [95% conf. interval] |
| | | |
| _cons | | .2261732 .0259945 8.70 0.000 .175225 .2771214 |
|
.
margins, predict(parameters(b))
warning: prediction constant over observations.
Predictive margins Number of obs = 1,000
Model VCE: GNR
Expression: Parameter b, predict(parameters(b))
|
| | | Delta-method |
| | | Margin std. err. z P>|z| [95% conf. interval] |
| | | |
| _cons | | .4799727 .0262525 18.28 0.000 .4285188 .5314266 |
|
.
margins, predict(parameters(c))
warning: prediction constant over observations.
Predictive margins Number of obs = 1,000
Model VCE: GNR
Expression: Parameter c, predict(parameters(c))
|
| | | Delta-method |
| | | Margin std. err. z P>|z| [95% conf. interval] |
| | | |
| _cons | | .2938541 .025686 11.44 0.000 .2435104 .3441978 |
|
Let’s fit the model by setting the restrictions using
nl:
. local ma2 (exp({t2})/(1+exp({t2})+exp({t3})))
. local ma3 (exp({t3})/(1+exp({t2})+exp({t3})))
. local ma1 (1/(1+exp({t2})+exp({t3})))
. nl (y = `ma1'*x1 + `ma2'*x2 + `ma3'*x3 + {a4}), delta(1e-7) nolog
note: option delta(#) has been deprecated, use option epsilon(#) instead.
| Source | | SS df MS | |
| | | Number of obs = 1,000 |
| Model | | 430.4658 2 215.232898 | R-squared = 0.3028 |
| Residual | | 991.16818 997 .99415063 | Adj R-squared = 0.3014 |
| | | Root MSE = .997071 |
| Total | | 1421.634 999 1.42305703 | Res. dev. = 2829.006 |
|
| y | | Coefficient Std. err. t P>|t| [95% conf. interval] |
| | |
| /d | | .9913625 .0315356 31.44 0.000 .9294787 1.053246 |
| /t2 | | .7524281 .1506533 4.99 0.000 .4567941 1.048062 |
| /t3 | | .2617824 .1748548 1.50 0.135 -.0813433 .6049081 |
|
Note: Parameter d is used as a constant term during estimation.
. local na2 exp(_b[t2:_cons])/(1+exp(_b[t2:_cons])+exp(_b[t3:_cons]))
. local na3 exp(_b[t3:_cons])/(1+exp(_b[t2:_cons])+exp(_b[t3:_cons]))
. local na1 1/(1+exp(_b[t2:_cons])+exp(_b[t3:_cons]))
. nlcom (a1: `na1') (a2: `na2') (a3: `na3')
a1: 1/(1+exp(_b[t2:_cons])+exp(_b[t3:_cons]))
a2: exp(_b[t2:_cons])/(1+exp(_b[t2:_cons])+exp(_b[t3:_cons]))
a3: exp(_b[t3:_cons])/(1+exp(_b[t2:_cons])+exp(_b[t3:_cons]))
|
| y | | Coefficient Std. err. z P>|z| [95% conf. interval] |
| | | |
| a1 | | .2261732 .0259945 8.70 0.000 .175225 .2771214 |
| a2 | | .4799727 .0262525 18.28 0.000 .4285188 .5314266 |
| a3 | | .2938541 .025686 11.44 0.000 .2435104 .3441978 |
|
To display a table with the coefficients, we can use the collect
prefix with margins. After fitting the model in Example 6 with nl,
you can write:
cscript
set seed 12345
set obs 1000
generate x1 = invnormal(runiform())
generate x2 = invnormal(runiform())
generate x3 = invnormal(runiform())
generate ep = invnormal(runiform())
generate y = .2*x1 + .5*x2 + .3*x3 + 1 + ep
nl (y = {a:}*x1 + {b:}*x2 + {c:}*x3 + {d}), ///
define(b: exp({t2})/(1+exp({t2})+exp({t3}))) ///
define(c: exp({t3})/(1+exp({t2})+exp({t3}))) ///
define(a: 1/(1+exp({t2})+exp({t3})))
collect clear
collect: margins, predict(parameters(a))
collect: margins, predict(parameters(b))
collect: margins, predict(parameters(c))
collect label levels cmdset 1 "a", modify
collect label levels cmdset 2 "b", modify
collect label levels cmdset 3 "c", modify
collect style header cmdset, level(label)
collect style header colname, level(hide)
collect layout (colname[_cons]#cmdset[1] colname[_cons]#cmdset[2] colname[_cons]#cmdset[3])
(result[_r_b _r_se _r_z _r_p _r_ci])