Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# st: two-stage mvprobit and ghk vs. sem algorithm questions

| From | Andrew |
|---|---|
| To | statalist@hsphsun2.harvard.edu |
| Subject | st: two-stage mvprobit and ghk vs. sem algorithm questions |
| Date | Mon, 16 Jul 2012 14:35:28 -0400 |

```Hi Statalist,

I have two questions:

Question 1:
I have been trying to confirm whether the following two-stage mvprobit
analysis is valid and would appreciate any thoughts/comments. I have
three equations of interest that I believe have correlated errors:

W1 = aA + dW2 + e1
X1 = bB + eX2 + e2
Y1 = cC + fY2 + e3

where
W1, X1, Y1 are dichotomous variables
A, B, C are exogenous variables
a, b, c are exogenous variable coefficients
W2, X2, Y2 are endogenous dichotomous variables
d, e, f are endogenous variable coefficients
e1, e2, e3 are errors that are jointly normally distributed

Each of these equations is itself part of a two-equation system of the
type described by Mallar (1977) and Maddala (1983, p. 246), such that:
W1 = aA + dW2 + e1
W2 = a'A' + d'W1 + u1
with analogous equations defined for X1 and X2, and for Y1 and Y2.

Mallar and Maddala solve this smaller system by estimating the reduced
form equation for each of the two variables, obtaining fitted values,
and then running further ML probits to obtain estimates of d/sigma1 and
a/sigma1.

My hope is that I can estimate the reduced form for the endogenous
variables (W2, X2, Y2):
W2 = a'A' + aA + v1
obtain their predicted values (W2*, X2*, Y2*), and then use those
in the original system, allowing for the correlated errors among the
W1, X1, Y1 equations by using the mvprobit command.

mvprobit (W1 = A W2*) (X1 = B X2*) (Y1 = C Y2*)

This is based on the idea that you could estimate the coefficients
in each of the smaller systems by performing the two-stage estimation
to obtain consistent results, but that by then running mvprobit we
obtain more efficient estimates that take into account the
error correlations. This is analogous to estimating OLS equations one
by one versus jointly by SUR.

Question 2:
The mvprobit command uses the GHK simulator.  My understanding is that
the GHK simulator is computationally efficient for systems of 4 or 5
equations but that for larger systems a stochastic EM algorithm is
likely to be a better option.  Is this correct?