How do I fit a linear regression with interval (inequality) constraints in Stata?
|
Title
|
|
Fitting a linear regression with interval (inequality) constraints using nl
|
|
Author
|
Isabel Canette, StataCorp
|
|
Date
|
October 2011
|
If you need to fit a linear model with linear constraints, you can use the
Stata command
cnsreg.
If you need to fit a nonlinear
model with interval constraints, you can use the
ml
command, as explained at
stata.com/support/faqs/statistics/regression-with-interval-constraints.
However, if you have a linear regression, the simplest way to include these
kinds of constraints is by using the
nl command.
Introduction
First, let's review how to fit a linear regression using
nl. We will use this command to fit a regression of
mpg2 on price and turn:
. sysuse auto
(1978 Automobile Data)
. generate mpg2 = -mpg
. nl (mpg2 = {a}*price + {b}*turn + {c})
(obs = 74)
Iteration 0: residual SS = 1016.186
Iteration 1: residual SS = 1016.186
Source | SS df MS
-------------+------------------------------ Number of obs = 74
Model | 1427.27353 2 713.636766 R-squared = 0.5841
Residual | 1016.18593 71 14.3124778 Adj R-squared = 0.5724
-------------+------------------------------ Root MSE = 3.783184
Total | 2443.45946 73 33.4720474 Res. dev. = 403.8641
------------------------------------------------------------------------------
mpg2 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
/a | .0005335 .0001579 3.38 0.001 .0002187 .0008483
/b | .8350376 .1058498 7.89 0.000 .6239791 1.046096
/c | -57.69477 4.027951 -14.32 0.000 -65.72627 -49.66326
------------------------------------------------------------------------------
Parameter c taken as constant term in model & ANOVA table
We will set inequality constraints (and interval constraints) via
transformations. For example, let's assume that we want parameter
a to be positive. This can be achieved by expressing
a as an exponential. We can, therefore, estimate
lna = ln(a) and then recover a = exp(lna)
after the estimation. The trick is to use a transformation whose
range is the interval over which we want to restrict the parameter.
Type
. help math functions
to see the mathematical functions available in Stata.
There are many ways to set interval constraints. The following examples show
some possibilities.
Example 1: Constraints of the form a > 0
As stated before, we will estimate ln(a) instead of
a.
. nl (mpg2 = exp({lna})*price + {b}*turn + {c}), nolog
(obs = 74)
Source | SS df MS
-------------+------------------------------ Number of obs = 74
Model | 1427.27353 2 713.636766 R-squared = 0.5841
Residual | 1016.18593 71 14.3124778 Adj R-squared = 0.5724
-------------+------------------------------ Root MSE = 3.783184
Total | 2443.45946 73 33.4720474 Res. dev. = 403.8641
------------------------------------------------------------------------------
mpg2 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
/lna | -7.535992 .2959172 -25.47 0.000 -8.126034 -6.94595
/b | .8350376 .1058498 7.89 0.000 .6239791 1.046096
/c | -57.69477 4.027951 -14.32 0.000 -65.72627 -49.66326
------------------------------------------------------------------------------
Parameter c taken as constant term in model & ANOVA table
The output shows the parameter lna. To recover
a, we can use the nlcom command; we can
always call nl (or any estimation command) with the
coeflegend option to see how to refer to the parameters in
further expressions.
. nl, coeflegend
Source | SS df MS
-------------+------------------------------ Number of obs = 74
Model | 1427.27353 2 713.636766 R-squared = 0.5841
Residual | 1016.18593 71 14.3124778 Adj R-squared = 0.5724
-------------+------------------------------ Root MSE = 3.783184
Total | 2443.45946 73 33.4720474 Res. dev. = 403.8641
------------------------------------------------------------------------------
mpg2 | Coef. Legend
-------------+----------------------------------------------------------------
/lna | -7.535992 _b[lna:_cons]
/b | .8350376 _b[b:_cons]
/c | -57.69477 _b[c:_cons]
------------------------------------------------------------------------------
Parameter c taken as constant term in model & ANOVA table
. nlcom a: exp(_b[lna:_cons])
a: exp(_b[lna:_cons])
------------------------------------------------------------------------------
mpg2 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
a | .0005335 .0001579 3.38 0.001 .0002187 .0008483
------------------------------------------------------------------------------
Example 2: Constraints of the form 0 < a < 1
Because the range of the inverse logit function is the interval (0,1), we
can use the Stata function invlogit() to set this
restriction.
. nl (mpg2 = invlogit({lgta})*price + {b}*turn + {c}), nolog
(obs = 74)
Source | SS df MS
-------------+------------------------------ Number of obs = 74
Model | 1427.27353 2 713.636766 R-squared = 0.5841
Residual | 1016.18593 71 14.3124778 Adj R-squared = 0.5724
-------------+------------------------------ Root MSE = 3.783184
Total | 2443.45946 73 33.4720474 Res. dev. = 403.8641
------------------------------------------------------------------------------
mpg2 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
/lgta | -7.535459 .2960752 -25.45 0.000 -8.125816 -6.945101
/b | .8350376 .1058498 7.89 0.000 .6239791 1.046096
/c | -57.69477 4.027951 -14.32 0.000 -65.72627 -49.66326
------------------------------------------------------------------------------
Parameter c taken as constant term in model & ANOVA table
. nlcom a: invlogit(_b[lgta:_cons])
a: invlogit(_b[lgta:_cons])
------------------------------------------------------------------------------
mpg2 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
a | .0005335 .0001579 3.38 0.001 .0002187 .0008483
------------------------------------------------------------------------------
Example 3: Constraints of the form -1 < a < 1
We can use the hyperbolic tangent function for constraints like this.
. nl (mpg2 = tanh({atanha})*price + {b}*turn + {c}), nolog
(obs = 74)
Source | SS df MS
-------------+------------------------------ Number of obs = 74
Model | 1427.27353 2 713.636766 R-squared = 0.5841
Residual | 1016.18593 71 14.3124778 Adj R-squared = 0.5724
-------------+------------------------------ Root MSE = 3.783184
Total | 2443.45946 73 33.4720474 Res. dev. = 403.8641
------------------------------------------------------------------------------
mpg2 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
/atanha | .0005335 .0001579 3.38 0.001 .0002187 .0008483
/b | .8350376 .1058498 7.89 0.000 .6239791 1.046096
/c | -57.69477 4.027951 -14.32 0.000 -65.72627 -49.66326
------------------------------------------------------------------------------
Parameter c taken as constant term in model & ANOVA table
. nlcom a: tanh(_b[atanha:_cons])
a: tanh(_b[atanha:_cons])
------------------------------------------------------------------------------
mpg2 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
a | .0005335 .0001579 3.38 0.001 .0002187 .0008483
------------------------------------------------------------------------------
Example 4: Constraints of the form 0 < a < b
We can express a as an exponential, as in Example 1, to
ensure that it will be positive. In addition, we want to set the restriction
b>a; therefore, we can also express the difference
b−a as an exponential.
. nl (mpg2 = exp({lna})*price + (exp({lndiff})+exp({lna}))*turn + {c}), nolog
(obs = 74)
Source | SS df MS
-------------+------------------------------ Number of obs = 74
Model | 1427.27353 2 713.636766 R-squared = 0.5841
Residual | 1016.18593 71 14.3124778 Adj R-squared = 0.5724
-------------+------------------------------ Root MSE = 3.783184
Total | 2443.45946 73 33.4720474 Res. dev. = 403.8641
------------------------------------------------------------------------------
mpg2 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
/lna | -7.535992 .2959172 -25.47 0.000 -8.126035 -6.94595
/lndiff | -.1809177 .1269002 -1.43 0.158 -.4339496 .0721142
/c | -57.69477 4.027951 -14.32 0.000 -65.72627 -49.66326
------------------------------------------------------------------------------
Parameter c taken as constant term in model & ANOVA table
. nlcom (a: exp(_b[lna:_cons])) (b: exp(_b[lna:_cons])+exp(_b[lndiff:_cons]))
a: exp(_b[lna:_cons])
b: exp(_b[lna:_cons])+exp(_b[lndiff:_cons])
------------------------------------------------------------------------------
mpg2 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
a | .0005335 .0001579 3.38 0.001 .0002187 .0008483
b | .8350376 .1058498 7.89 0.000 .6239791 1.046096
------------------------------------------------------------------------------
Example 5: Constraints of the form k1.a < k2.b < k2.c
The concepts in Example 4 can be extended to similar cases, like
a<b<c or a<b<2c.
For example, let's assume that we want to fit the regression
nl (turn = {a}*headroom + {b}*displacement + {c})
with the constraints 0.5a<10b<2c.
These two inequalities can be presented as
10b - 0.5a > 0
2c - 10b > 0
and we can express the left-hand sides of these as
exponentials to ensure that they will turn out positive.
We can then estimate the two following parameters
d1 = exp(lnd1) = 10b - 0.5a
d2 = exp(lnd2) = 2c - 10b
which imply that we can substitute b and c
as follows:
b = 0.1(d1 + 0.5a)
c = 0.5(d2 + 10b)
Hence, our command line would look like this:
. nl (turn = {a}*headroom + 0.1*(exp({lnd1})+0.5*{a})*displacement ///
> + 0.5*(exp({lnd2}) + 10*0.1*(exp({lnd1})+0.5*{a}))), nolog
(obs = 74)
Source | SS df MS
-------------+------------------------------ Number of obs = 74
Model | 858.168014 2 429.084007 R-squared = 0.6074
Residual | 554.696851 71 7.8126317 Adj R-squared = 0.5963
-------------+------------------------------ Root MSE = 2.795109
Total | 1412.86486 73 19.3543132 Res. dev. = 359.0653
------------------------------------------------------------------------------
turn | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
/a | .3751308 .4392973 0.85 0.396 -.5008032 1.251065
/lnd1 | -1.782972 1.436274 -1.24 0.219 -4.64682 1.080877
/lnd2 | 4.137724 .0388728 106.44 0.000 4.060214 4.215234
------------------------------------------------------------------------------
Parameter lnd2 taken as constant term in model & ANOVA table
. nlcom (a: _b[a:_cons]) ///
> (b: 0.1*(exp(_b[lnd1:_cons])+0.5*_b[a:_cons])) ///
> (c: 0.5*(exp(_b[lnd2:_cons]) + ///
> exp(_b[lnd1:_cons])+0.5*_b[a:_cons]))
a: _b[a:_cons]
b: 0.1*(exp(_b[lnd1:_cons])+0.5*_b[a:_cons] )
c: 0.5*(exp(_b[lnd2:_cons]) + exp(_b[lnd1:_cons])+0.5*_b[a:_cons])
------------------------------------------------------------------------------
turn | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
a | .3751308 .4392973 0.85 0.396 -.5008032 1.251065
b | .0355703 .0040468 8.79 0.000 .0275013 .0436393
c | 31.50786 1.214813 25.94 0.000 29.0856 33.93013
------------------------------------------------------------------------------
Example 6: Setting a model where parameters are proportions
Finally, let’s see how to fit a model where the coefficients are
proportions; that is, they are all positive and add up to one.
We will fit the linear model
y = a1*x1 + a2*x2 + a3*x3 + a4 + ε
where a1, a2, and a3 are
positive, and a1 + a2 + a3 = 1.
We will use the transformation implemented in the Stata command
mlogit:
a2 = exp(t2)/(1+exp(t2)+exp(t3))
a3 = exp(t3)/(1+exp(t2)+exp(t3))
a1 = 1/(1+exp(t2)+exp(t3))
Here we illustrate the concept with simulated data:
. clear
. set seed 12345
. set obs 1000
obs was 0, now 1000
. generate x1 = invnormal(runiform())
. generate x2 = invnormal(runiform())
. generate x3 = invnormal(runiform())
. generate ep = invnormal(runiform())
. generate y = .2*x1 + .5*x2 + .3*x3 + 1 + ep
Although the actual coefficients used for the simulation add up to one, the
estimates obtained with regress most likely will not.
. regress y x1 x2 x3
Source | SS df MS Number of obs = 1000
-------------+------------------------------ F( 3, 996) = 111.86
Model | 351.109476 3 117.036492 Prob > F = 0.0000
Residual | 1042.09494 996 1.04628006 R-squared = 0.2520
-------------+------------------------------ Adj R-squared = 0.2498
Total | 1393.20441 999 1.39459901 Root MSE = 1.0229
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x1 | .1815533 .0322495 5.63 0.000 .1182685 .2448382
x2 | .5008877 .0328364 15.25 0.000 .4364512 .5653243
x3 | .2909175 .0324558 8.96 0.000 .2272279 .3546072
_cons | 1.038042 .0323758 32.06 0.000 .9745092 1.101575
------------------------------------------------------------------------------
. display "sum of coefficients = " _b[x1] + _b[x2] + _b[x3]
sum of coefficients = .97335864
Let’s fit the model by setting the restrictions using
nl:
. local ma2 (exp({t2})/(1+exp({t2})+exp({t3})))
. local ma3 (exp({t3})/(1+exp({t2})+exp({t3})))
. local ma1 (1/(1+exp({t2})+exp({t3})))
. nl (y = `ma1'*x1 + `ma2'*x2 + `ma3'*x3 + {a4}), delta(1e-7) nolog
(obs = 1000)
Source | SS df MS
-------------+------------------------------ Number of obs = 1000
Model | 350.884044 2 175.442022 R-squared = 0.2519
Residual | 1042.32037 997 1.04545674 Adj R-squared = 0.2504
-------------+------------------------------ Root MSE = 1.022476
Total | 1393.20441 999 1.39459901 Res. dev. = 2879.326
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
/t2 | .9853402 .1688961 5.83 0.000 .6539076 1.316773
/t3 | .4539762 .1955981 2.32 0.020 .0701451 .8378073
/a4 | 1.03804 .0323631 32.07 0.000 .9745326 1.101548
------------------------------------------------------------------------------
Parameter a4 taken as constant term in model & ANOVA table
. local na2 exp(_b[t2:_cons])/(1+exp(_b[t2:_cons])+exp(_b[t3:_cons]))
. local na3 exp(_b[t3:_cons])/(1+exp(_b[t2:_cons])+exp(_b[t3:_cons]))
. local na1 1/(1+exp(_b[t2:_cons])+exp(_b[t3:_cons]))
. nlcom (a1: `na1') (a2: `na2') (a3: `na3')
a1: 1/(1+exp(_b[t2:_cons])+exp(_b[t3:_cons]))
a2: exp(_b[t2:_cons])/(1+exp(_b[t2:_cons])+exp(_b[t3:_cons]))
a3: exp(_b[t3:_cons])/(1+exp(_b[t2:_cons])+exp(_b[t3:_cons]))
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
a1 | .1903571 .0260724 7.30 0.000 .139194 .2415203
a2 | .509914 .0264489 19.28 0.000 .4580122 .5618159
a3 | .2997288 .0263153 11.39 0.000 .2480891 .3513685
------------------------------------------------------------------------------
If you find convergence issues when using nl to solve these
problems, you might try using ml instead, as explained in
stata.com/support/faqs/statistics/regression-with-interval-constraints.
Using ml would allow you to specify analytical derivatives
and to have better control of your optimization process.
|