Statalist



st: -cmp- adds multinomial probit


From   "David Roodman (DRoodman@cgdev.org)" <DRoodman@CGDEV.ORG>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: -cmp- adds multinomial probit
Date   Mon, 7 Jul 2008 09:09:37 -0400

I have made some significant changes to -cmp-.

A shorthand way of describing its current status is to list the
commands it can emulate to various degrees: probit, ivprobit, biprobit,
oprobit, mprobit, asmprobit, tobit, ivtobit, cnreg, heckman, heckprob,
sureg, triprobit, mvprobit, bitobit, mvtobit, bioprobit. Its purpose,
though, is not to emulate these commands but to allow estimation of a
broader variety of models.

For those unfamiliar with -cmp-: At its core, it is a flexible SUR
maximum-likelihood-based estimator that offers a range of models based
on the normal distribution, which can be mixed and matched: continuous
(OLS-like), probit, tobit, ordered probit, and multinomial probit. But it
is also consistent for non-SUR models with recursive sets of equations,
i.e., ones in which some LHS variables belong in other LHS variables'
equations, provided that the equations can be arranged so that the
matrix of the LHS variables' coefficients in each others' equations is
triangular. As David Drukker helped me appreciate, and as the help file
now explains, this estimation framework applies to two broad cases: 1)
the true data-generating process is recursive, in which case -cmp- is a
FIML estimator; 2) the DGP is not recursive, but instruments make it
possible to construct a recursive system, just as in 2SLS, that allows
estimation of the structural parameters in the final stage--making -cmp-
a LIML estimator. (The last stage may contain more than one equation.)
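
The triangularity condition described above can be checked mechanically:
a system is recursive exactly when its equations can be ordered so that
each one uses only LHS variables from earlier equations. Here is a
minimal Python sketch of that check. It is not part of -cmp-, and the
function name and the example variable names are hypothetical.

```python
def recursive_order(equations):
    """Return an ordering of LHS variables that makes the system triangular,
    or None if no such ordering exists (the system is simultaneous).

    `equations` maps each LHS variable to the set of variables on the
    right-hand side of its equation. Exogenous regressors are ignored.
    """
    lhs = set(equations)
    # Keep only regressors that are themselves LHS variables of some equation.
    remaining = {y: set(rhs) & lhs for y, rhs in equations.items()}
    order = []
    while remaining:
        placed = set(order)
        # Equations whose endogenous regressors have all been placed already.
        ready = sorted(y for y, deps in remaining.items() if deps <= placed)
        if not ready:
            return None  # every remaining equation waits on another one
        order.extend(ready)
        for y in ready:
            del remaining[y]
    return order


# Hypothetical recursive system: schooling enters the wage equation,
# but wage does not enter the schooling equation.
print(recursive_order({
    "schooling": {"parent_educ", "distance"},
    "wage": {"schooling", "age"},
}))  # -> ['schooling', 'wage']

# Hypothetical simultaneous system: each LHS variable is in the other's equation.
print(recursive_order({
    "price": {"quantity", "cost"},
    "quantity": {"price", "income"},
}))  # -> None
```

The returned ordering corresponds to a triangular coefficient matrix;
a result of None means no arrangement of the equations is triangular.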

The recent changes are:
1) Addition of an (alternative-specific) multinomial probit equation
type. This can be used with two different syntaxes: one analogous to
-mprobit-; one more like -asmprobit-, in which the user lists a separate
equation for each outcome. The help file explains more and includes
examples. I am not yet satisfied with the reliability of convergence for
multinomial probit models with no restrictions on the covariance matrix.
I may need to parameterize this matrix with the Cholesky decomposition
instead of, or in addition to, the lnsig/atanhrho parameterization.

2) Addition of an "lf" mode. -ml-, on which -cmp- is based, accommodates
four types of likelihood calculation routine. Till now, -cmp- has been
purely a "d1" routine, which calculates likelihood gradients (scores) at
each iteration analytically. But "lf" is *sometimes* faster, more
precise, or more reliably convergent, as the help file discusses. Most
of my -cmp- command lines included either "lf" or "tech(dfp)" (the
latter defaulting to d1), but not both. (For problems requiring the GHK
algorithm, d1 is almost always better.)

3) Switching to ghk2(), a new Mata implementation of the GHK
algorithm for simulating higher-dimensional cumulative normal
probabilities. See
http://www.stata.com/statalist/archive/2008-06/msg00323.html. I found
this was necessary for -cmp- to work well on multi-equation probit
models with many observations and few draws. (Cappellari and Jenkins,
Stata Journal 3(3), find that you can get reasonable estimates with as
few as 5 draws per observation when the number of observations is high.)
ghk2() also speeds up computation for multi-equation ordered probit
models.
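
For readers unfamiliar with GHK, the idea can be sketched in a few lines
of Python. This is an illustration only, not the ghk2() code: ghk2() is
written in Mata and its actual interface is described in its own help
file. The function names, the use of plain pseudo-random uniforms, and
the default number of draws are all assumptions made for this sketch. It
simulates P(Z_1 < b_1, ..., Z_d < b_d) for Z ~ N(0, Sigma) by drawing
each dimension from a truncated normal, conditional on the earlier ones,
and averaging the product of the conditional probabilities.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal: provides cdf() and inv_cdf()


def cholesky(sigma):
    """Lower-triangular L with L L' = sigma (sigma must be positive definite)."""
    n = len(sigma)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sigma[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def ghk_probability(upper, sigma, draws=1000, seed=12345):
    """GHK estimate of P(Z_i < upper[i] for all i), with Z ~ N(0, sigma)."""
    rng = random.Random(seed)
    L = cholesky(sigma)
    d = len(upper)
    total = 0.0
    for _ in range(draws):
        e = []        # truncated standard-normal draws, one dimension at a time
        weight = 1.0  # product of conditional probabilities for this draw
        for i in range(d):
            # Upper bound on e_i implied by Z_i < upper[i], given earlier draws.
            bound = (upper[i] - sum(L[i][k] * e[k] for k in range(i))) / L[i][i]
            p = _STD.cdf(bound)
            weight *= p
            if i < d - 1:
                # Inverse-CDF draw from the standard normal truncated above at
                # `bound`; the clamp keeps inv_cdf() inside its open domain.
                q = min(max(rng.random() * p, 1e-12), 1.0 - 1e-12)
                e.append(_STD.inv_cdf(q))
        total += weight
    return total / draws


# Bivariate check with correlation 0.5: the exact orthant probability is
# 1/4 + asin(0.5)/(2*pi) = 1/3, so the two printed values should be close.
estimate = ghk_probability([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], draws=2000)
print(estimate, 0.25 + math.asin(0.5) / (2 * math.pi))
```

Because each draw contributes a smooth product of probabilities rather
than a 0/1 indicator, the simulated likelihood varies less across draws
than a crude frequency simulator would, which is consistent with the
observation above that few draws per observation can suffice.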

-cmp- now requires -ghk2-. The installation commands are "ssc install
cmp, replace" and "ssc install ghk2, replace". Restart Stata after
installing.

As always, comments welcome.
--David
