Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# st: Simultaneous Ordered Probit and Linear Equation Model, Unbalanced Data, ML Programming

From: Hendrik M
To: Statalist
Subject: st: Simultaneous Ordered Probit and Linear Equation Model, Unbalanced Data, ML Programming
Date: Wed, 21 Sep 2011 15:54:39 +0200

```
Dear all,

I am trying to estimate an ordered probit model and a linear equation simultaneously with unbalanced data, but I am facing some problems.
I am using Stata 11.1.

The model:

1. Latent variable model (ordered probit)
y1* = x1 + beta1*x2 + err1

y1 in {0,1,2,3} is the observed ordered outcome, derived from y1* via cut-points.
The coefficient of x1 is constrained to 1.

2. Linear equation model

2.a. Constant elasticity specification
y2 = gamma1*x2 + gamma2*ln(y1) + err2

or

2.b. Fixed Effects specification
y2 = gamma1*x2 + gamma2*d(y1=1) + gamma3*d(y1=2) + gamma4*d(y1=3) + err2

Data availability:

y1 is a variable with discrete values and has, for example, 1000 obs.
y2 is a continuous variable, but it is only observed if y1>=1, so let's say 800 obs.

The Question:

As you can see, y2 depends on y1, either through the constant-elasticity specification (2.a.) or the fixed-effects specification (2.b.). You obtain a joint likelihood, and both equations contribute to that likelihood function.

When I program the likelihood function, Stata keeps refusing to use all 1000 observations; it only uses the 800 obs for which y2 is also available. This leads to a biased estimate of the first cut-point, because that cut-point should be estimated from all 1000 obs rather than only 800. It actually leads to biased estimates of all coefficients, since they are conditional on each other.

Can anyone tell me why Stata only uses 800 obs instead of 1000? Or is there a trick to force Stata to ignore the missing observations of y2?

I already tried to replace the missing y2 with 0, but that would change the nature of the model.

Furthermore, Stata has difficulty achieving convergence, which might be caused by a failure to compute the Hessian. There might also be an initial-values problem, but I used the results of estimating the ordered probit and the linear regression separately to give Stata suitable initial values, which is in my opinion the best approach; I would nonetheless be happy about other suggestions. Those two problems are certainly secondary.

The following might help to clarify. I programmed these likelihood contributions (sorry for the mess, but it is kind of complicated):

for y1=0:

P(0) = 1 - normprob[(lambda1*x2 + theta1)/sigma_err1]

for y1>=1:

f(y2)*P(y1|y2) = 1/sigma_err2 * normalden(err2/sigma_err2)
                 * [ normprob[(lambda1*x2 + thetaN     - sigma_err1_err2/sigma_err2^2*err2)/sqrt(sigma_err1^2 - sigma_err1_err2^2/sigma_err2^2)]
                   - normprob[(lambda1*x2 + theta(N+1) - sigma_err1_err2/sigma_err2^2*err2)/sqrt(sigma_err1^2 - sigma_err1_err2^2/sigma_err2^2)] ]

using some definitions:

lambda1 = gamma1 + beta1
thetaN is a "cut-point"

The likelihood function in Stata (given the fixed-effects specification):

capture program drop simultaneous_full
program define simultaneous_full
args lnf bx1 theta1 theta2 theta3 bx2 sigmaR sigmaE sigmaER

qui replace `lnf' = ln(1-normal((`theta1'-x1-`bx1')/`sigmaE')) if $ML_y1==0

qui replace `lnf' = -ln(`sigmaR') + ln(normalden(($ML_y2-`bx2')/`sigmaR')) + ///
    ln(normal((-`bx1'-x1+`theta2'-`sigmaER'/`sigmaR'^2*($ML_y2-`bx2'))/sqrt(`sigmaE'^2-`sigmaER'^2/`sigmaR'^2)) - ///
       normal((-`bx1'-x1+`theta1'-`sigmaER'/`sigmaR'^2*($ML_y2-`bx2'))/sqrt(`sigmaE'^2-`sigmaER'^2/`sigmaR'^2))) ///
    if $ML_y1==1

qui replace `lnf' = -ln(`sigmaR') + ln(normalden(($ML_y2-`bx2')/`sigmaR')) + ///
    ln(normal((-`bx1'-x1+`theta3'-`sigmaER'/`sigmaR'^2*($ML_y2-`bx2'))/sqrt(`sigmaE'^2-`sigmaER'^2/`sigmaR'^2)) - ///
       normal((-`bx1'-x1+`theta2'-`sigmaER'/`sigmaR'^2*($ML_y2-`bx2'))/sqrt(`sigmaE'^2-`sigmaER'^2/`sigmaR'^2))) ///
    if $ML_y1==2

qui replace `lnf' = -ln(`sigmaR') + ln(normalden(($ML_y2-`bx2')/`sigmaR')) + ///
    ln(1-normal((-`bx1'-x1+`theta3'-`sigmaER'/`sigmaR'^2*($ML_y2-`bx2'))/sqrt(`sigmaE'^2-`sigmaER'^2/`sigmaR'^2))) ///
    if $ML_y1==3

end

Further references (if interested):

Schaumans, Catherine & Verboven, Frank, 2011, "Entry and Competition in Differentiated Product Markets", CEPR Discussion Paper 8353, C.E.P.R. Discussion Papers

or

Ferrari & Verboven & Degryse, "Investment and Usage of New Technologies: Evidence from a Shared ATM Network", American Economic Review, Vol. 100(3), 1046-79, June 2010

I post a copy of my do-file at the end; it simulates data and estimates the described model(s) up to a certain extent. It might be useful just to copy it into a do-file and let it run; the problem I am facing should then be much clearer if you look at the number of observations. Just a quick warning: it might be necessary to re-run it several times (3-4), since the data-generating process can produce non-converging or discontinuous regions, which ml maximize refuses to optimize.

I sincerely apologize for the very complex presentation of my problem, but it is a very tricky likelihood function and estimation.

Any help and time you spend on this problem will be much appreciated.

Happy to discuss.
Best regards,
Hendrik

The code follows:

clear
clear matrix
clear mata
set mem 100m
set more off
capture log close

/*------------------------
Data Simulation
------------------------*/

set obs 1000

* generate regressors
gen x1_aux = 10 + 10*invnorm(uniform())^2
gen x2 = uniform()

* generate y1 and corresponding error err1

gen err1 = invnorm(uniform())
gen y1 = x1_aux - 10*x2 + err1

* now we assume that we only observe ordinal data
sum y1, d

recode y1 min/9=0  9/18=1 18/32=2 32/max=3
tab y1

* gen dummies for categorial variable y1
tab y1, gen(dy1_)

rename dy1_1 dy1_0
rename dy1_2 dy1_1
rename dy1_3 dy1_2
rename dy1_4 dy1_3

* generate data for linear equation
gen err2 = invnorm(uniform())

gen y2_aux = (10*x2 + 10*y1 + err2)/y1 if y1>=1

gen y2=ln(y2_aux)

gen x1=ln(x1_aux)

drop x1_ y2_

/*------------------------

The Estimation

------------------------*/

*** Simple Ordered Probit, Ordered Outcome with 4 categories, y1 ***

oprobit y1 x1 x2

capture program drop myoprob

program define myoprob
qui    {
args lnf bx1 theta1 theta2 theta3
replace `lnf' = ln(normprob(`theta1'-x1-`bx1')) if $ML_y1==0
replace `lnf' = ln(normprob(`theta2'-x1-`bx1')-normprob(`theta1'-x1-`bx1')) if $ML_y1==1
replace `lnf' = ln(normprob(`theta3'-x1-`bx1')-normprob(`theta2'-x1-`bx1')) if $ML_y1==2
replace `lnf' = ln(1-normprob(`theta3'-x1-`bx1')) if $ML_y1==3

}
end

* and estimate the model
ml model lf myoprob (OProbit: y1 = x2, nocons) ///
(Cut1: ) (Cut2: ) (Cut3: )

ml search

ml maximize
matrix initialC = e(b)

*** Linear equation, y2, fixed effects specification ***

reg y2 x2 dy1_1 dy1_2 dy1_3

* Estimating linear equation by hand

capture program drop linear1
program define linear1
qui {
args lnf bx2 sigma_err2
replace `lnf' = -ln(`sigma_err2')+ln(normalden(($ML_y1-`bx2')/`sigma_err2')) if $ML_y1!=.
}
end

* Model: 1. Structural Equation; 2. Sigma

ml model lf linear1 (Linear: y2 = x2 dy1_1 dy1_2 dy1_3) (Sigma_err2:  )

* Starting values
ml search
* Go!
ml maximize

matrix initialB = e(b)

* The simple program, estimating only OProbit and Linear, sigmaR and sigmaE given, sigmaER zero

capture program drop simultaneous1
program define simultaneous1
qui {
args lnf bx1 bx2 theta1 theta2 theta3 sigma_err2

replace `lnf' = ln(1-normprob(`theta1'-x1-`bx1')) if $ML_y1==0

replace `lnf' = -ln(`sigma_err2') + ln(normalden(($ML_y2-`bx2')/`sigma_err2')) + ///
    ln(normprob(`theta2'-x1-`bx1') - normprob(`theta1'-x1-`bx1')) if $ML_y1==1
replace `lnf' = -ln(`sigma_err2') + ln(normalden(($ML_y2-`bx2')/`sigma_err2')) + ///
    ln(normprob(`theta3'-x1-`bx1') - normprob(`theta2'-x1-`bx1')) if $ML_y1==2
replace `lnf' = -ln(`sigma_err2') + ln(normalden(($ML_y2-`bx2')/`sigma_err2')) + ///
    ln(1-normprob(`theta3'-x1-`bx1')) if $ML_y1==3
}
end

ml model lf simultaneous1 (Oprobit: y1 = x2, noconst) ///
(Linear: y2 = x2 dy1_1 dy1_2 dy1_3) ///
(theta1: ) (theta2: ) (theta3: ) (sigma_err2: )

ml check

ml search

ml maximize, difficult iter(50)

matrix initialA = e(b)
matrix initial_sigma_err1 = (-1)
matrix initial_sigma_err1_err2 = (0.2)

*** Simultaneous Model full model ***

capture program drop simultaneous_full
program define simultaneous_full
args lnf bx1 theta1 theta2 theta3 bx2 sigmaR sigmaE sigmaER

qui replace `lnf' = ln(1-normal((`theta1'-x1-`bx1')/`sigmaE')) if $ML_y1==0

qui replace `lnf' = -ln(`sigmaR') + ln(normalden(($ML_y2-`bx2')/`sigmaR')) + ///
    ln(normal((-`bx1'-x1+`theta2'-`sigmaER'/`sigmaR'^2*($ML_y2-`bx2'))/sqrt(`sigmaE'^2-`sigmaER'^2/`sigmaR'^2)) - ///
       normal((-`bx1'-x1+`theta1'-`sigmaER'/`sigmaR'^2*($ML_y2-`bx2'))/sqrt(`sigmaE'^2-`sigmaER'^2/`sigmaR'^2))) ///
    if $ML_y1==1

qui replace `lnf' = -ln(`sigmaR') + ln(normalden(($ML_y2-`bx2')/`sigmaR')) + ///
    ln(normal((-`bx1'-x1+`theta3'-`sigmaER'/`sigmaR'^2*($ML_y2-`bx2'))/sqrt(`sigmaE'^2-`sigmaER'^2/`sigmaR'^2)) - ///
       normal((-`bx1'-x1+`theta2'-`sigmaER'/`sigmaR'^2*($ML_y2-`bx2'))/sqrt(`sigmaE'^2-`sigmaER'^2/`sigmaR'^2))) ///
    if $ML_y1==2

qui replace `lnf' = -ln(`sigmaR') + ln(normalden(($ML_y2-`bx2')/`sigmaR')) + ///
    ln(1-normal((-`bx1'-x1+`theta3'-`sigmaER'/`sigmaR'^2*($ML_y2-`bx2'))/sqrt(`sigmaE'^2-`sigmaER'^2/`sigmaR'^2))) ///
    if $ML_y1==3

end

* note: the equation order must match the args order in simultaneous_full and the ml init vector
ml model lf simultaneous_full (Oprobit: y1 = x2, noconst) ///
(theta1: ) (theta2: ) (theta3: ) ///
(Linear: y2 = x2 dy1_1 dy1_2 dy1_3) ///
(sigmaR: ) (sigmaE: ) (sigmaER: )
ml init initialC initialB initial_sigma_err1 initial_sigma_err1_err2, copy skip
*ml check
*ml search
ml maximize, difficult iter(10)
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```
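The piecewise likelihood contributions described in the post can be sanity-checked numerically outside Stata. The sketch below is one possible reading of those contributions under a standard ordered-probit sign convention: for y1 = 0 only the marginal ordered-probit probability enters, while for y1 >= 1 the joint term f(y2)*P(y1|y2) enters, with the cut-point bounds shifted by the conditional mean of err1 given err2 and scaled by the conditional standard deviation. All parameter values and names here (`cuts`, `s1`, `s2`, `s12`, `gamma1`) are illustrative stand-ins for the thetas and sigmas, not values from the post, and the y1 dummies of specification 2.b. are omitted for brevity.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def loglik_obs(y1, y2, x1, x2, beta1, gamma1, cuts, s1, s2, s12):
    """Log-likelihood contribution of one observation (illustrative only)."""
    xb = x1 + beta1 * x2              # index; coefficient on x1 fixed to 1
    if y1 == 0:
        # y2 unobserved: only the marginal ordered-probit term P(y1 = 0)
        return math.log(norm_cdf((cuts[0] - xb) / s1))
    # y2 observed: joint contribution f(y2) * P(y1 | y2)
    err2 = y2 - gamma1 * x2           # linear-equation residual (dummies omitted)
    shift = s12 / s2**2 * err2        # conditional mean shift of err1 given err2
    csd = math.sqrt(s1**2 - s12**2 / s2**2)  # conditional sd of err1 given err2
    hi = norm_cdf((cuts[y1] - xb - shift) / csd) if y1 < 3 else 1.0
    lo = norm_cdf((cuts[y1 - 1] - xb - shift) / csd)
    return -math.log(s2) + math.log(norm_pdf(err2 / s2)) + math.log(hi - lo)

# One observation per branch; each contribution is finite even when y2 is missing.
print(loglik_obs(0, None, 9.0, 0.0, -10.0, 10.0, (9.0, 18.0, 32.0), 1.0, 1.0, 0.2))  # log(0.5) ≈ -0.6931
print(loglik_obs(1, 0.3, 9.5, 0.05, -10.0, 10.0, (9.0, 18.0, 32.0), 1.0, 1.0, 0.2))
```

With the covariance term `s12` set to 0, the y1 >= 1 branch factorizes into separate ordered-probit and linear-regression contributions, which mirrors the structure of the simpler `simultaneous1` program in the do-file.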