st: ml program in which likelihood function depends on value of independent var

From: Malgosia Madajewicz
To: statalist@hsphsun2.harvard.edu
Subject: st: ml program in which likelihood function depends on value of independent var
Date: Tue, 15 Mar 2005 13:28:41 -0500

Hi,

I'm trying to do maximum likelihood estimation in Stata and have run into a problem. I believe that I can use the lf method. However, my model differs from anything discussed in the Gould, Pitblado, and Sribney book: it is a two-stage model, and the form of the second-stage likelihood function depends on the value of the dependent variable from the first stage. I can think of two ways to implement this, but each involves a step I don't know how to do.

I describe the model below, but not the full underlying structure (the correlations between the error terms, etc.), because that would take too long. Please assume that my approach to the endogeneity problems is correct, since I am fairly sure that it is. Take this purely as a question about how to implement what I want to do.

I write the model assuming that I implement it by first running the first-stage probit regression with the probit command and saving the predicted values from that stage as a variable in the data. I then write an ml routine for the second stage only. The model is the following:

1st stage: probit regression delta = z * gamma + error
delta is binary, z is the vector of independent vars, gamma is the coefficient vector

Let deltahat denote the predicted values from the first stage; these are the linear prediction (the xb option of predict), not predicted probabilities.
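The first-stage step described above could look roughly like the following. This is only a sketch: the regressor names z1, z2 and z3 are hypothetical stand-ins for the variables in z.

```stata
* Hypothetical names: delta (binary first-stage outcome), z1 z2 z3 (the vector z)
version 8

* First stage: probit of delta on z
probit delta z1 z2 z3

* The xb option stores the linear index z*gamma-hat, not the predicted probability
predict double deltahat, xb
```

Storing deltahat as a double avoids losing precision when it is later passed through norm() and normden().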

2nd stage likelihood function:

lnf = delta * ln[1 - norm(xb/sigma)] + (1 - delta) * ln[1 - norm(x'b/sigma)]   if $ML_y1 == 0
lnf = delta * {ln[normden($ML_y1, xb, sigma)] - ln sigma} + (1 - delta) * {ln[normden($ML_y1, x'b, sigma)] - ln sigma}   if $ML_y1 > 0

In this likelihood x is a vector of independent variables, one of which is normden(deltahat)/norm(deltahat). x' is a vector of independent variables, one of which is normden(deltahat)/(1 - norm(deltahat)).
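The two correction terms just described could be generated as ordinary variables before calling ml. A minimal sketch, assuming deltahat already holds the first-stage linear prediction; the names lambda1 and lambda0 are hypothetical:

```stata
version 8

* Term that enters x (used when delta == 1)
generate double lambda1 = normden(deltahat)/norm(deltahat)

* Term that enters x' (used when delta == 0)
generate double lambda0 = normden(deltahat)/(1 - norm(deltahat))
```

norm() and normden() are the function names in the Stata release current at the time of this post; later releases call them normal() and normalden().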

Two elements of this model differ from what I have seen. First, the form of the likelihood function for each observation depends on the value of delta for that observation; delta is not a dependent variable here, just one of the variables in the data set. Second, the vector of independent variables differs in one position depending on whether delta is 1 or 0. Could someone tell me how to handle these issues in the ml program?
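Since delta only takes the values 0 and 1, the delta-weighted sum in the likelihood picks out exactly one of its two terms for each observation. One possible way to express both points in an lf evaluator is therefore to build a single correction regressor that switches with delta, so that one index stands in for xb or x'b as appropriate. The sketch below rests on several assumptions that are not part of the model as stated: the names y, x1, x2, lambda and twostage_lf are hypothetical, deltahat is assumed to exist already, sigma is parameterized as exp(lnsigma) to keep it positive, and the density is written with the one-argument normden() on the standardized residual so that ln sigma is subtracted exactly once.

```stata
* Hypothetical names throughout; deltahat is the first-stage linear prediction
version 8

* One regressor that equals the x term when delta == 1 and the x' term when delta == 0
generate double lambda = normden(deltahat)/norm(deltahat) if delta == 1
replace lambda = normden(deltahat)/(1 - norm(deltahat)) if delta == 0

capture program drop twostage_lf
program define twostage_lf
    version 8
    args lnf xb lnsigma
    tempvar sigma
    quietly generate double `sigma' = exp(`lnsigma')

    * Contribution of observations with y == 0
    quietly replace `lnf' = ln(1 - norm(`xb'/`sigma')) if $ML_y1 == 0

    * Contribution of observations with y > 0: normal log density
    quietly replace `lnf' = ln(normden(($ML_y1 - `xb')/`sigma')) - ln(`sigma') if $ML_y1 > 0
end

* y is assumed to be nonnegative, so every observation falls in one of the two branches
ml model lf twostage_lf (xb: y = x1 x2 lambda) /lnsigma
ml maximize
```

Using a single switching regressor forces the same coefficient on the correction term in both regimes, which matches a common b in xb and x'b. If the two terms should have separate coefficients, lambda1 and lambda0 could instead be entered as two regressors, each set to zero in the regime where it does not apply.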

Alternatively, I could imagine writing an ml program that estimates both equations, so that delta is simply the dependent variable of the first equation. However, that would be an ml program that maximizes two different likelihoods and uses the result of the first in the second, and I have no idea how to do this.

My eternal gratitude for any help.

Malgosia

Assistant Professor of Economics and International Affairs
Columbia University
SIPA
IAB 1304, MC 3323
420 West 118th Street
New York, N.Y. 10027

tel: (212) 854-4311
fax: (212) 854-5765
