st: new, fast version of xtabond2
Almost exactly two years ago, I announced the release of xtabond2 on this list (http://www.stata.com/statalist/archive/2003-11/msg00756.html). I noted several advantages over xtabond and one big disadvantage: it was slow.
There is now a new version of xtabond2 that takes advantage of the fast Mata programming language in Stata 9. It actually requires Stata 9.1 to run fast. If you have 9.0, you can upgrade for free by typing "update executable" in Stata. If Mata is not available (if you are running Stata 7 or 8), xtabond2 runs the old way.
To install, type "ssc install xtabond2, replace" in Stata. The package includes two new auxiliary files. "abest.do" reproduces some results in Arellano and Bond (1991), and complements the two other demo files, "bbest.do" and "greene.do". "xtabond2.mata" contains the new source code. It is not needed to run the program, but is included in the spirit of open software.
If I run the following with and without the final -nomata- option:
xtabond2 n L.n L(0/1).(w k) yr1978-yr1984, gmm(L.(w k n)) iv(yr1978-yr1984, eq(level)) h(2) robust twostep small nomata
the Mata version gives a 7-fold speed improvement. Not as fast as DPD for Ox, but a lot better. (This corresponds to a Blundell-Bond (1998) result; see bbest.do.)
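To check the speed difference on your own machine, one option is to time the command above with and without -nomata-. This is only a sketch: it assumes the variables above come from the Arellano-Bond data set that "webuse abdata" loads, and that your Stata has the -timer- command. The ratio you see will depend on your machine and Stata version.

```stata
* Sketch: time the ado and Mata code paths on the same model.
* Assumption: "webuse abdata" provides n, w, k and yr1978-yr1984.
webuse abdata, clear

* Store the command once so both runs are identical apart from -nomata-.
local cmd xtabond2 n L.n L(0/1).(w k) yr1978-yr1984, gmm(L.(w k n)) iv(yr1978-yr1984, eq(level)) h(2) robust twostep small

timer clear

timer on 1
quietly `cmd' nomata    // old ado code
timer off 1

timer on 2
quietly `cmd'           // Mata code (Stata 9.1 or later)
timer off 2

timer list
display "ado time / Mata time = " r(t1)/r(t2)
```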
There is not much change in functionality. One thing I did change is that if you don't use the -small- option, a Wald chi2 test is performed of overall model fit instead of an F test. Before, I was copying ivreg2 in always doing the F test, but I decided to copy Stata's built-in -test- command instead. Now if you do "xtabond2 y x1 x2...xN, ...", you should always get the same model-fit results as you do if you then run "test x1 x2...xN". One other change is that "gmm(x, lag(a b) eq(level))" now generates lags a...b instead of a-1...b-1.
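The model-fit change can be checked by hand. The sketch below assumes the abdata example used elsewhere in this message, and assumes the overall test covers every regressor except the constant. Without -small-, the Wald chi2 in the xtabond2 header should match what -test- reports for the same regressors.

```stata
* Sketch: compare xtabond2's overall Wald chi2 with Stata's -test-.
* Assumption: "webuse abdata" provides n, w, k and yr1978-yr1984.
webuse abdata, clear

* No -small- option, so the header reports a Wald chi2 rather than an F.
xtabond2 n L.n L(0/1).(w k) yr1978-yr1984, gmm(L.(w k n)) iv(yr1978-yr1984, eq(level)) h(2) robust twostep

* Joint test of all regressors, listed one by one
* (a range like yr1978-yr1984 could be read by -test- as a difference).
test L.n w L.w k L.k yr1978 yr1979 yr1980 yr1981 yr1982 yr1983 yr1984
```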
Also, there is a new option, -nomata-, which prevents xtabond2 from using the fast Mata code even under Stata 9, forcing it to run the old ado code instead. In the early days of this new program, when it is most likely to have bugs, a run with -nomata- provides a useful check on results.
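A check of this kind might look like the following sketch, which saves the coefficient vector from each code path and reports their largest relative difference. It assumes the abdata example again; the matrix names b_mata and b_ado are just illustrative. If the two versions drop different collinear variables, the vectors will not conform and the comparison will fail, which is itself informative.

```stata
* Sketch: compare coefficients from the Mata and ado code paths.
* Assumption: "webuse abdata" provides n, w, k and yr1978-yr1984.
webuse abdata, clear

local cmd xtabond2 n L.n L(0/1).(w k) yr1978-yr1984, gmm(L.(w k n)) iv(yr1978-yr1984, eq(level)) h(2) robust twostep small

quietly `cmd'
matrix b_mata = e(b)    // coefficients from the Mata code

quietly `cmd' nomata
matrix b_ado = e(b)     // coefficients from the old ado code

* mreldif() returns the maximum relative difference between two matrices;
* a value near zero means the two versions agree.
display "max relative difference = " mreldif(b_mata, b_ado)
```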
If you get an error message in *red*, this probably indicates an error on my part. More generally, please send data and commands that demonstrate what appear to be bugs. I will continually update the public version as necessary.
I have found two sources of potential discrepancy when the "nomata" option is turned on and off. First, if collinear or near-collinear variables or instruments are generated, the two versions of the program may differ in which are dropped, and even how many to drop, since the two versions use different routines and tolerances for determining collinearity. Second, the Mata version can get confused by an expression like "gmm(L.x, lag(-1 -1))". In principle, this is the same as gmm(x, lag(0 0)). But the Mata version first lags x, losing the latest observations of x (for t=T), then unlags the remaining information. The ado version does not lose data in this way. This kind of construct is strange, so ordinarily, it should not be a problem.
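The data loss from lagging and then unlagging can be seen in plain Stata with a toy series. This is an analogy to the sequence described above, not xtabond2's internal code; the variable names are made up for the illustration.

```stata
* Sketch: lag x, then lead the result, and compare with x itself.
clear
set obs 5
generate t = _n
tsset t
generate x = 10*t

generate Lx  = L.x     // lag: missing at t = 1
generate FLx = F.Lx    // "unlag" the lagged series

* FLx equals x for t = 1..4 but is missing at t = 5 (the last period),
* because Lx has no observation at t = 6 to lead back from.
list t x Lx FLx
```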
Thanks to Tue Gorgens for testing.
Center for Global Development