The Stata listserver

st: maximum likelihood procedures for oprobit


From   "Sunhwa Lee" <[email protected]>
To   [email protected]
Subject   st: maximum likelihood procedures for oprobit
Date   Wed, 7 Dec 2005 18:29:25 -0800 (PST)

Hello,
I am writing an ml program that replicates the "oprobit" command in Stata. 
Despite many trials, I have not succeeded in getting exactly the same 
estimates as with the "oprobit" command. Below is the basic program I used to 
compare with the oprobit estimates. 

**************************************
capture program drop myoprobit
    program define myoprobit
    args lnf xb  t1 t2 t3
    tempvar p1 p2 p3 p4
    qui gen double `p1'=ln(norm(`t1'-`xb'))
    qui gen double `p2'=ln(norm(`t2'-`xb')-norm(`t1'-`xb'))
    qui gen double `p3'=ln(norm(`t3'-`xb')-norm(`t2'-`xb'))
    qui gen double `p4'=ln(norm(-`t3'+`xb'))
    qui replace `lnf'=($ML_y1==1)*`p1' + ($ML_y1==2)*`p2' /*
                  */ +($ML_y1==3)*`p3' + ($ML_y1==4)*`p4'
    end
clear
sysuse auto
replace rep=2 if rep==. | rep==1
replace rep=rep-1
xi: ml model lf myoprobit (rep =mpg i.turn, nocons) (tau1:) (tau2:) (tau3:)
ml maximize
*******************************************
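
For reference, the log likelihood that myoprobit is meant to evaluate for the four-category outcome can be written as below. The notation is an assumption for illustration: \Phi is the standard normal cdf, x_i\beta corresponds to `xb', and \tau_1, \tau_2, \tau_3 correspond to `t1', `t2', `t3'.

```latex
\ln L = \sum_{i=1}^{N} \sum_{j=1}^{4} \mathbf{1}[y_i = j]\,
\ln\bigl[\Phi(\tau_j - x_i\beta) - \Phi(\tau_{j-1} - x_i\beta)\bigr],
\qquad \tau_0 = -\infty,\quad \tau_4 = +\infty .
```

For j = 4 the bracket reduces to 1 - \Phi(\tau_3 - x_i\beta) = \Phi(x_i\beta - \tau_3), which is the term stored in `p4'.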

The two results look similar at first glance, but on closer inspection the 
estimates for _Iturn_32 and _Iturn_46 differ. The differences may be 
negligible with auto.dta, but they are amplified with my dataset, again on 
the dummies.
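
For reference, the oprobit side of the comparison can be sketched as follows. It assumes the same recoding of rep78 as in the program above; the matrix name b_oprobit and the display format are illustrative choices only.

```stata
* Baseline: built-in oprobit on the same recoded outcome
clear
sysuse auto
replace rep78 = 2 if rep78 == . | rep78 == 1
replace rep78 = rep78 - 1
xi: oprobit rep78 mpg i.turn

* Keep the coefficient vector and list it with extra decimals
matrix b_oprobit = e(b)
matrix list b_oprobit, format(%12.8f)
```

Listing e(b) in the same way after ml maximize shows both coefficient vectors at the same precision, so small differences on the dummies are easier to see.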

To make sure, I varied the base model above in the following ways:

1) Using alternative distribution function: norm vs. normprob
2) Define p4, the last probability, differently: 
`p4'=ln(1-norm(`t3'-`xb')) vs. `p4'=ln(norm(-`t3'+`xb'))
3) with or without equation name: 
(rep =mpg i.turn, nocons) vs. (auto: rep =mpg i.turn, nocons)
4) with the default ml tolerance and ltolerance vs. with the Stata internal 
values, that is, tolerance(1e-4) and ltolerance(0) 
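
Variation (2) can be probed on its own at the command line. A minimal sketch, assuming an arbitrary value of 9 for `t3'-`xb' (chosen only to land far in the upper tail):

```stata
* Two algebraically equal forms of the last log probability,
* evaluated at an assumed, illustrative tail value
local x = 9
display %18.0g ln(1 - norm(`x'))
display %18.0g ln(norm(-`x'))
```

If the two displayed values differ, or the first one is missing, then the two definitions of p4 are not numerically interchangeable for observations far in the tail, even though they are equal on paper.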

By combining these four alternatives, I got 16 variations of the ml ordered 
probit program. The coefficients on _Iturn_32 and _Iturn_46 still differ 
across the 16 variations, and none of the 16 models reproduces the 
estimates from the "oprobit" command. To my knowledge, modifications (1), 
(2) and (3) should not make any difference to the estimates. 

My questions are:

1) How can I replicate the oprobit results, including the values for the 
dummies?
2) Why do the first three variations above result in different estimates 
for some coefficients? As far as I understand:
   - normprob calculates the same normal cdf as norm.
   - The last probability can be defined either way: 
ln(1-norm(`t3'-`xb')) = ln(norm(-`t3'+`xb')). I have read a post that 
recommends ln(norm(-`t3'+`xb')) over ln(1-norm(`t3'-`xb')) in the 
following Statalist thread:
http://www.stata.com/statalist/archive/2003-05/msg00076.html
   - Including an equation name should not affect the estimation.
3) With my dataset, changing the tolerance/ltolerance levels made a big 
difference in some dummy estimates. What, then, is the best way to set 
these tolerance levels?
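
One way to gauge how sensitive the estimates are to the convergence settings is to fit the same model under two sets of tolerances and compare the coefficient vectors. The sketch below does this on auto.dta. The tighter values tolerance(1e-8) and ltolerance(1e-10) and the matrix names b_loose and b_tight are assumptions for illustration, not recommended settings.

```stata
capture program drop myoprobit
program define myoprobit
    args lnf xb t1 t2 t3
    tempvar p1 p2 p3 p4
    * Log probabilities of the four outcome categories
    qui gen double `p1' = ln(norm(`t1'-`xb'))
    qui gen double `p2' = ln(norm(`t2'-`xb') - norm(`t1'-`xb'))
    qui gen double `p3' = ln(norm(`t3'-`xb') - norm(`t2'-`xb'))
    qui gen double `p4' = ln(norm(-`t3'+`xb'))
    qui replace `lnf' = ($ML_y1==1)*`p1' + ($ML_y1==2)*`p2' /*
                  */ + ($ML_y1==3)*`p3' + ($ML_y1==4)*`p4'
end

clear
sysuse auto
replace rep78 = 2 if rep78 == . | rep78 == 1
replace rep78 = rep78 - 1

* Fit 1: the settings from variation (4)
xi: ml model lf myoprobit (rep78 = mpg i.turn, nocons) (tau1:) (tau2:) (tau3:)
ml maximize, tolerance(1e-4) ltolerance(0)
matrix b_loose = e(b)

* Fit 2: tighter tolerances (assumed values, for illustration only)
xi: ml model lf myoprobit (rep78 = mpg i.turn, nocons) (tau1:) (tau2:) (tau3:)
ml maximize, tolerance(1e-8) ltolerance(1e-10)
matrix b_tight = e(b)

* Largest relative difference between the two coefficient vectors
display mreldif(b_loose, b_tight)
```

A large value here would suggest that at least one coefficient is still moving when the looser criterion stops the iterations.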
I'd be very grateful for any suggestions.

Sunhwa