Statalist



Re: st: Upgrading from v9 to v10mp


From   vwiggins@stata.com (Vince Wiggins, StataCorp)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Upgrading from v9 to v10mp
Date   Mon, 02 Jul 2007 14:47:03 -0500

Simon Moore <simoncmoore@gmail.com> asks about performance increases in
Stata/MP over Stata/SE.

> There's several reasons why I'd like to upgrade from Stata 9 to
> Stata 10.  But I have a question about MP and whether this is
> something I should be looking at (as I have dual core CPU). [...]

For almost all users, Stata/MP will run substantially faster than Stata/SE, if
not for all analyses, then for most.

When developing Stata/MP we focused on commands that take the longest to run
so as to get the most benefit from parallelization.  Because estimators
typically take the longest time to run, they are a good group of commands to
compare.  Concerning a dual-core computer (or any two-processor computer) the
median runtime among all estimators under Stata/MP is 60% of the runtime
required under Stata/SE.  This represents a very high degree of
parallelization because a 50% runtime is the theoretical limit on two
processors, the best that can be hoped for.
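
One way to relate the 60% median to the 50% limit is Amdahl's law.  The post
does not name that model, so treat the sketch below as an illustration under
that assumption rather than as StataCorp's own calculation.  If a fraction p
of a command's work is parallelized across n cores, the relative runtime is
(1 - p) + p/n.  The function names are hypothetical.

```python
def relative_runtime(p, cores):
    """Runtime relative to a single core under Amdahl's law (an assumed model)."""
    return (1.0 - p) + p / cores


def implied_parallel_fraction(runtime_ratio, cores):
    """Invert Amdahl's law: the fraction of work that must be parallel
    to achieve runtime_ratio on the given number of cores."""
    return (1.0 - runtime_ratio) / (1.0 - 1.0 / cores)


# Median estimator from the post: 60% of the Stata/SE runtime on two cores.
median_p = implied_parallel_fraction(0.60, 2)   # about 0.8
# Slower quartile from the post: 75% runtime on two cores.
slow_p = implied_parallel_fraction(0.75, 2)     # 0.5
# A fully parallel command reaches the 50% limit on two cores.
best = relative_runtime(1.0, 2)                 # 0.5
```

Under this model, a 60% runtime on two cores corresponds to roughly 80% of
the work being parallelized, and a 75% runtime to about half of it.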

Half of all estimators perform even better than 60% runtime, meaning they
approach the theoretical limit.  One-quarter of the estimators, however, run
in 75% of the time or more, a lesser performance increase, and a few
estimators are not parallelized at all.  Most of the latter are time-series
estimators.  Many time series computations are recursive, meaning that you
must know the answer to the prior period before you can compute the current
period.  Such recursive computations are inherently unparallelizable.
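
To make the recursion point concrete, here is a small sketch.  It is not
taken from Stata's internals; it uses a hypothetical first-order recursive
filter.  Each value needs the previous one, so the loop cannot simply be
split into independent halves, whereas an elementwise transformation can.

```python
def recursive_filter(x, alpha=0.5):
    """y[t] = alpha * x[t] + (1 - alpha) * y[t-1]; each step needs y[t-1]."""
    y = []
    prev = 0.0
    for value in x:
        prev = alpha * value + (1.0 - alpha) * prev
        y.append(prev)
    return y


def elementwise_square(x):
    """Each output depends only on its own input, so chunks are independent."""
    return [value * value for value in x]


data = [1.0, 2.0, 3.0, 4.0]

# Elementwise work: processing two halves separately gives the same answer.
halves = elementwise_square(data[:2]) + elementwise_square(data[2:])
same_elementwise = halves == elementwise_square(data)

# Recursive work: the second half, started from scratch, lacks y[t-1]
# from the first half, so naive splitting gives a different answer.
naive = recursive_filter(data[:2]) + recursive_filter(data[2:])
same_recursive = naive == recursive_filter(data)
```

Here same_elementwise is True and same_recursive is False, which is the
difficulty with recursive time-series computations in miniature.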

The commands that are important to you might not be those important to
others.  To find out about the performance improvement of every command in
Stata that takes any time to run, see the Stata/MP white paper --
http://www.stata.com/statamp/report.pdf.  Do note that the white paper has
not been updated to Stata 10 and does not reflect the parallelization of
survey commands.

Simon goes on to ask about some specific commands, one of them a
user-written command.

> [...] Probably the most intensive work involves such things as
> xtlogit and I've also looked at reoprob - with a large survey and
> bootstrapping would MP work through user written commands (i.e.
> reoprob) any faster?  Other things that can take time include
> cycling through large -foreach num- lists and using postfile .

Let's consider each of Simon's heavily-used, time-consuming commands.

As Austin Nichols <austinnichols@gmail.com> noted, there are really two
estimators in -xtlogit-, -xtlogit, fe- for fixed-effects models and -xtlogit,
re- for random-effects models.  Both are highly parallelized, with -xtlogit,
fe- running in 65% of the time and -xtlogit, re- in just under 50% of the
time.  As Austin notes, the random-effects estimator's
better-than-theoretically-possible performance includes a cache effect that
will vary across computers.  Regardless, -xtlogit, re- runs much faster under
Stata/MP.

Since he is interested in -xtlogit-, I assume that Simon works with
panel/longitudinal data.  He may then use many -by:  generate- or -egen ...,
by()- commands.  Because of their panel nature, such expressions are almost
100% parallelized and will run in about half the time under Stata/MP.
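
The reason by-group work splits so cleanly can be sketched as follows.  This
is only an illustration of the independence between panels, not a
description of how Stata/MP is implemented; the data and the function name
are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical panel data: panel id -> observations for that panel.
panels = {1: [2.0, 4.0], 2: [1.0, 3.0, 5.0], 3: [10.0], 4: [6.0, 8.0]}


def panel_means(ids):
    """Compute the mean within each panel for the given panel ids."""
    return {i: sum(panels[i]) / len(panels[i]) for i in ids}


ids = sorted(panels)
chunks = [ids[:2], ids[2:]]  # one chunk of panels per worker

# No panel needs results from another panel, so the chunks can be
# processed concurrently and merged afterwards.
with ThreadPoolExecutor(max_workers=2) as pool:
    parts = list(pool.map(panel_means, chunks))

parallel_result = {k: v for part in parts for k, v in part.items()}
serial_result = panel_means(ids)
```

The merged result matches the serial one.  The sketch shows why the work can
be divided, not how much faster it runs.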

I ran a quick test of the user-written maximum likelihood estimator -reoprob-.
On a large dataset, Stata/MP estimated the model in 68% of the time of
Stata/SE.  This is fast, but I was surprised it wasn't even faster, because in
most experiments I have run with user-written estimators the results have been
closer to 60%.  Why do user-written commands speed up at all?  They are not
themselves parallelized, but the commands used to implement them are.  For
instance, all of the components of -ml- are parallelized.  I am guessing that
in the case of -reoprob-, the quadrature computation was not as conducive to
this kind of parallelization.

Neither -foreach- nor -postfile- is parallelized.  The loops they implement,
however, will run faster to the extent that the commands in the loop are
themselves parallelized.

The same thing is true for -bootstrap- and all other replication methods.
They run faster to the extent that the command being bootstrapped, jackknifed,
etc., is parallelized.  This is discussed on page 7 of the Stata/MP white
paper in Section 7, Performance Summary.
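
As a rough back-of-the-envelope sketch of that point, the runtime of a loop
or a replication method can be modeled as serial bookkeeping plus the time of
the command being repeated.  The model, the function name, and all timings
below are hypothetical; they are not measurements from the white paper.

```python
def loop_runtime(reps, overhead_per_rep, command_time, command_ratio=1.0):
    """Total time for a loop whose own bookkeeping is serial.

    command_ratio is the command's Stata/MP runtime relative to Stata/SE
    (for example 0.65 for a command that runs in 65% of the time).
    """
    return reps * (overhead_per_rep + command_time * command_ratio)


# Hypothetical timings in seconds; not measurements from the post.
se_time = loop_runtime(reps=1000, overhead_per_rep=0.5, command_time=2.0)
mp_time = loop_runtime(reps=1000, overhead_per_rep=0.5, command_time=2.0,
                       command_ratio=0.65)
overall_ratio = mp_time / se_time
```

With these made-up numbers the whole loop runs in about 72% of the time even
though the command inside it runs in 65%, because the loop's own overhead
does not shrink.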

As David Airey <david.airey@Vanderbilt.Edu> mentioned, all survey computations
(think -svy:- prefix) have been parallelized in Stata 10.  This means that the
survey portion of the computation runs in about half the time.  For commands
like -regress- and -streg- this means that they are nearly 100% parallelized,
even with survey data.  (Note the subtle plug that survival estimators now
support complex survey data.)

With the release of Stata 10, another common question has been, "can I run
Stata/MP on a single-core, single-processor computer?"  Yes.  There is a small
amount of additional overhead that is unnecessary when running in a single-CPU
environment, but in most cases it was unmeasurable.  By the way, you can also
run Stata/SE under your Stata/MP license if you are concerned about this small
overhead.


-- Vince
   vwiggins@stata.com



