RE: st: Constrained linear regression... is not linear?

From   Nick Cox
Subject   RE: st: Constrained linear regression... is not linear?
Date   Tue, 7 Dec 2010 18:41:22 +0000

The main problem with your initial formulation is that -constraint- accepts only linear equality constraints, and inequalities such as x >= 0 are not of that form.

So you need some other way of ensuring that your parameter value lies between 0 and 1. The standard trick is to parameterise indirectly using -invlogit()-, which maps any real number into the interval (0,1).
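
A minimal sketch of why the -invlogit()- trick works. The input values are arbitrary illustrations, not taken from anyone's data. Whatever real number the optimiser tries, the transformed value stays strictly inside (0,1), and -logit()- undoes the transformation.

```stata
* invlogit(b) = exp(b)/(1 + exp(b)) lies strictly between 0 and 1
* for any real b, so the optimiser can search over b without bounds.
display invlogit(-5)
display invlogit(0)
display invlogit(5)

* logit() is the inverse function: this returns the probability we started with.
display invlogit(logit(.3))
```

Note that the interval is open: the fitted weight can get arbitrarily close to 0 or 1 but cannot equal either exactly.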

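I don't think you need to learn Stata programming, ignoramus or not. It seems to me that what you most need is to study how to invoke -nl-, and Austin is just providing an example. logit(.3) is an initial value for the parameter search: it is the value of the transformed parameter at which -invlogit()- returns .3. That is also why changing it need not change your results, so long as -nl- converges to the same solution.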
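
A stripped-down sketch of such an -nl- call, assuming simulated data in which the true weight on x is .3. The data, the seed, the local macro name b0 and the labels a and one_minus_a are all made up for illustration. The reference [b]_cons follows Austin's code; newer Stata versions may also accept other ways of referring to -nl- parameters.

```stata
* Hypothetical simulated data: y = .3*x + .7*z + noise
clear
set seed 12345
set obs 1000
generate x = rnormal()
generate z = rnormal()
generate y = .3*x + .7*z + rnormal()

* b is unconstrained; the weight on x is invlogit(b), so it lies in (0,1).
* {b=`b0'} names the parameter and supplies its starting value,
* here chosen so that the search begins at a weight of .3.
local b0 = logit(.3)
nl (y = invlogit({b=`b0'})*x + (1 - invlogit({b=`b0'}))*z)

* -nl- reports b on the logit scale. -nlcom- transforms it back to the
* weights of interest, with delta-method standard errors.
nlcom (a: invlogit([b]_cons)) (one_minus_a: 1 - invlogit([b]_cons))
```

Replacing logit(.3) by, say, logit(.6) changes only where the iterations start. If the problem is well behaved, the final estimates should agree to within convergence tolerance.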

The FAQ Maarten cited is a bit of a red herring: although it carries much useful detail, it is phrased in terms of -ml-, and -nl- is more useful here.



> > kokootchke wrote:
> >> I am trying to run the following constrained linear
> >> regression:
> >> y = ax + (1-a)z, with a in [0,1]
> > 
> >> What I'm doing is the following:
> >> constraint define 1 x+z = 1
> >> constraint define 2 x >= 0
> >> constraint define 3 x <= 1
> >> cnsreg y x z, c(1-3)

> > Maarten Buis wrote:
> > Constraints 2 and 3 are not allowed with -cnsreg-. The
> > problem is the fact that you want to constrain the parameter
> > within a certain range, and this is not considered to be
> > a linear constraint. If you want to estimate this model you'll
> > have to use either -nl- or -ml- as is discussed here:

Austin Nichols wrote:
> If you're getting an answer outside [0,1] then perhaps your model is
> incorrectly specified, and you should rethink it. That said, try:
> drawnorm x z e, n(1000) clear seed(1)
> g y=min(max(1,round(3+.3*x+.7*z+e)),5)
> g ylz=y-z
> constraint define 1 x+z = 1
> cnsreg y x z, c(1-3)
> nl (y={a}+{b}*x+(1-{b})*z)
> loc i=logit(.3)
> qui nl (y={a=0}+invlogit({b=`i'})*x+(1-invlogit({b=`i'}))*z)
> nlcom (invlogit([b]_cons)) (1-invlogit([b]_cons))
> qui nl (ylz={a=0}+invlogit({b=`i'})*(x-z))
> nlcom (invlogit([b]_cons)) (1-invlogit([b]_cons)), post
> test _b[_nl_1]=_b[_nl_2]

Dear Maarten and Austin, 
Thank you for your replies, but I'm afraid they are not very helpful -- mainly because I don't understand the code.
I have implemented Austin's code by changing the variable names y, x, z, and it does give me reasonable numbers, but I'd like to know what it's doing, and there are still a few things I don't get. I see that in the first few lines you are simply creating variables and making sure y is some linear combination of those variables.
What does logit(.3) do, and why are you doing that? Is it because .3 is the coefficient for x above? I have changed this parameter and noticed it has no effect on the results afterwards, but I don't understand what the nl command is doing with this value for the local variable i, or what the nlcom command is doing afterwards.
Maarten, I have read the information in the link you sent, but I must say that it's beyond my programming capabilities. I'm not completely ignorant when it comes to programming, and I have programmed in C and Fortran before, so perhaps you could recommend a few links or books that could teach me how to do this sort of programming in Stata?
Thank you once again.
Adrian

