[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

From |
jpitblado@stata.com (Jeff Pitblado, StataCorp LP) |

To |
statalist@hsphsun2.harvard.edu |

Subject |
Re: st: [Mata newbie] optimize() vs -ml- |

Date |
Tue, 16 Oct 2007 16:37:19 -0500 |

Antoine Terracol <terracol@univ-paris1.fr> asks about relative speed between Stata's -ml- command and Mata's -optimize()- suite (new is Stata 10): > I was wondering if there was any general rule to decide wether a given > likelihood maximisation problem will be more efficiently handled using > Mata's optimize() or Stata's -ml- command ? > > When I compare the example code given in the help file for optimize() to > fit a ml estimation of a beta density (v0 method) whith the equivalent > -ml- code (lf method), I find that optimize() runs approximately 3 to 5 > times faster than -ml-, depending on the size of the (simulated) dataset. > > However, when fitting a simple probit model (two explanatory variables), > optimize() (v0 method) runs significantly slower than -ml- (lf method). > > Is it caused by the matrix multiplication (cross()) needed to calculate > x*beta when explanatory variables are present, or is my code badly > written? > > (omitting Antoine's code) When comparing -ml- and -optimize()- on equal playing fields, -optimize()- will tend to be faster than -ml-. The transpose operator in Antoine's -optimize()- evaluator for the probit model can be expensive (it causes Mata to make a copy) but is not the culprit here. Antoine's observations to the contrary are due to the choice of evaluator types in the comparison. In fact having predictors in the probit model accentuates the timing difference between the -v0- and -lf- evaluator types. 1. Predictors: For the 'beta density' case Antoine's models do not contain predictors, but the 'probit' model contains 2 predictors. If Antoine removes the predictors from the 'probit' models, -optimize()- will finish the job faster than -ml- for the same constant-only model. 2. Evaluator types: The reason for the flip-flop of results is due to how -ml- and -optimize()- take numerical derivatives given the user specified evaluator type. -optimize()- with type -v0- evaluators must take the numerical derivative with respect to each element of the parameter vector passed to the users evaluator function (3 parameter elements, for each predictor and the intercept). -ml- with type -lf- evaluators take numerical derivatives with respect to the linear predictor in each equation (1 equation for the probit model). Here is a table listing the evaluator types of -ml- and -optimize()-; with the comparable evaluator types in the same row: Stata's -ml- Mata -optimize()- ------------------------------------------------- d0 d0 d1 (without scores) d1 d2 (without scores) d2 lf -- -- v0 d1 (with scores) v1 d2 (with scores) v2 The point here is that -ml-'s methods for handling type -lf- evaluators are not comparable to -optimize()-'s methods for type -v0-. If Antoine can derive formulas for producing the gradient for an -ml- type -d1- evaluator and the parameter level scores for an -optimize()- type -v1- evaluator, a speed comparison will show that -ml- with -d1- and -optimize()- with -v1- will be faster than -lf- and -v0-, respectively, and -optimize()- will tend to be faster than -ml-. Following my signature, I've included a modified version of Antoine's 'probit' model comparison. My -probit_v2()- function computes parameter level scores (and a Hessian matrix which is not used by -v1-) for -optimize()-, while my -myprobit_d2- program computes a gradient (and the negative Hessian which is not used by -d1-) for -ml-. A quick timing on my machine showed the following results: . timer list 1: 0.13 / 1 = 0.1280 2: 0.18 / 1 = 0.1830 Note that -optimize()- cannot have a type -lf- evaluator because there is no way to impose the linear form assumptions. User arguments are free to be any valid Mata object and -optimize()- does not have the concept of an equation. Incidentally, changing the specified evaluator types from -v1- to -v2- and -d1- to -d2- yielded the following results: . timer list 1: 0.04 / 1 = 0.0440 2: 0.08 / 1 = 0.0830 And moving the evaluator types to -v0- and -d0- (all numerical derivatives) yielded: . timer list 1: 0.18 / 1 = 0.1780 2: 0.27 / 1 = 0.2710 (Note that timings will vary from run-to-run due to the amount of activity on your computer). --Jeff jpitblado@stata.com ----------------------------------------------------------------------------- clear all set mem 20m set obs 2000 drawnorm x1 x2 eps gen cons=1 gen y=(1+x1+x2+eps>0) timer clear 1 timer clear 2 mata: void lnprobit_v2( real scalar todo, real rowvector beta, real colvector y, real matrix X, real colvector llj, real matrix g, real matrix H ) { real colvector pm real colvector xb real colvector lj real colvector dllj real colvector d2llj real scalar dim real scalar nobs nobs = rows(y) dim = cols(X) if (nobs != rows(X) | dim != cols(beta)) { _error(3200) } pm = 2*(y :!= 0) :- 1 xb = X*beta' lj = normal(pm:*xb) llj = ln(lj) if (todo == 0 | missing(llj)) return dllj = pm :* normalden(xb) :/ lj if (missing(dllj)) { llj = . return } g = dllj :* X if (todo == 1) return d2llj = dllj :* (dllj + xb) if (missing(d2llj)) { llj = . return } H = - cross(X, d2llj, X) } timer_on(1) y=x=. st_view(y, ., "y") st_view(x, ., ("x1","x2","cons")) S = optimize_init() optimize_init_evaluator(S, &lnprobit_v2()) optimize_init_evaluatortype(S, "v1") optimize_init_params(S, (0.1,0.5,0)) optimize_init_argument(S, 1, y) optimize_init_argument(S, 2, x) betahat = optimize(S) sigmahat=sqrt(diagonal(optimize_result_V_oim(S)))' tstat=betahat:/sigmahat pval=2:*(1:-normal(abs(tstat))) (betahat\sigmahat\tstat\pval)' timer_off(1) end program myprobit_d2 version 10 args todo b lnf g negH g1 tempvar xb lj mleval `xb' = `b' quietly { gen double `lj' = normal( `xb') if $ML_y1 == 1 replace `lj' = normal(-`xb') if $ML_y1 == 0 mlsum `lnf' = ln(`lj') if (`todo'==0 | `lnf' >= .) exit replace `g1' = normalden(`xb')/`lj' if $ML_y1 == 1 replace `g1' = -normalden(`xb')/`lj' if $ML_y1 == 0 mlvecsum `lnf' `g' = `g1', eq(1) if (`todo'==1 | `lnf' >= .) exit mlmatsum `lnf' `negH' = `g1'*(`g1'+`xb'), eq(1,1) } end ml model d1 myprobit_d2 (y=x1 x2) mat I= (0.1 , 0.5,0) ml init I, copy timer on 2 ml max, search(off) timer off 2 timer list ***** END: * * For searches and help try: * http://www.stata.com/support/faqs/res/findit.html * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:**Re: st: [Mata newbie] optimize() vs -ml-***From:*Antoine Terracol <Antoine.Terracol@univ-paris1.fr>

- Prev by Date:
**Re: st: programming graphs in stata** - Next by Date:
**st: RE: Re: moving from Stata to Mata** - Previous by thread:
**st: [Mata newbie] optimize() vs -ml-** - Next by thread:
**Re: st: [Mata newbie] optimize() vs -ml-** - Index(es):

© Copyright 1996–2017 StataCorp LLC | Terms of use | Privacy | Contact us | What's new | Site index |