Re: st: GMM speed


From   "Brian P. Poi" <brian@poiholdings.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: GMM speed
Date   Sat, 21 May 2011 22:09:55 -0400

On 05/21/2011 09:17 PM, Li, Wei wrote:
Dear Statalist Members,

I tried to run the following simple nonlinear GMM specification with over 2 million observations. It took more than 10 hours to complete iteration 0. I had to cancel the procedure and rerun the estimation using a 1% sample of the data to test the specification. I guess I could get a workstation with more than one processor and Stata/MP, but that would be expensive.

I also tried to include the derivatives in the Stata statement. Doing that increased speed quite a bit using the 1% sample, but it is still taking a very long time (it has been almost 16 hours and I am still waiting).

Any suggestions?


gmm (y-{a=0.4}*L.y-ln({b=0.1}*x1+{c=4}*x2+x3)+{a}*ln({a}*L.x1+{b}*L.x2+L.x3)-{c=0}),xtinstruments(y x1 x2 x3, lags(2/4)) instruments(L.x1) winitial(xt D) wmat(hac nw 2) vce(unadjusted)


Is the nonlinear specification essential? Could you instead fit a linear model and interpret it as a first-order series expansion of your nonlinear model? If so, you could use -xtabond-, -xtdpd-, -xtdpdsys-, or David Roodman's -xtabond2- (available on SSC and described in volume 9, issue 1 of the Stata Journal).

Those estimators can build up the entire instrument matrix all at once, and because they are linear estimators, they are much quicker.
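
As a rough illustration of the linear route, something along these lines might be a starting point with -xtdpd-. This is only a sketch: it assumes the data can be -xtset- with panel and time variables named id and year (hypothetical names not given in the original post), it drops the nonlinear terms entirely, and it simply carries over the lag-2-through-4 instruments and L.x1 from the -gmm- call. Whether the linear approximation is acceptable is a modeling decision for you to make.

```stata
* Sketch only: a linear dynamic-panel stand-in for the nonlinear model.
* id and year are assumed names for the panel and time identifiers.
xtset id year

* Regress y on its first lag and x1-x3. Lags 2 through 4 of the levels
* serve as GMM-type instruments for the differenced equation, and L.x1
* is a standard instrument for the differenced equation.
xtdpd L(0/1).y x1 x2 x3, dgmmiv(y x1 x2 x3, lagrange(2 4)) div(L.x1) vce(robust)
```

Trying this on the 1% sample first would show whether the specification behaves sensibly before committing to the full 2,000,000 observations.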

-gmm- provides much more flexibility, but that comes at a cost. -gmm- builds up the panel-style instrument matrix for each panel individually when computing the GMM criterion function, and it does not save those matrices from one function evaluation to the next, because for large datasets with many panels and many time periods the storage requirements could be enormous. Moreover, since -gmm- uses nonlinear optimization methods, it must evaluate the function many times. As a result, -gmm- with panel-style instruments and 2,000,000 observations will be slow.

   -- Brian Poi
   -- brian@poiholdings.com