Re: st: MP running no faster than IC

From   Stas Kolenikov <>
To   "" <>
Subject   Re: st: MP running no faster than IC
Date   Mon, 8 Jul 2013 22:38:41 -0500

auto.dta is too small a dataset, and -regress- is too simple a task,
for MP to show its advantage. The overhead of spreading the work across
several processors may actually be what slows you down. Simulate
yourself an n=100,000 data set and run -gllamm- on it, and you'll see
the difference. Hypothetically, -bootstrap- should be very easy to run
in parallel, but StataCorp's approach here is that ado code (which
-bootstrap- is) does not know about multiple processors, and neither
does Mata code -- only the underlying C code is allowed to utilize the
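
A minimal sketch of the comparison suggested above, under stated assumptions: it simulates n=100,000 observations and times the same estimation loop with one processor and with the maximum available. To stay self-contained it uses the built-in -regress- with 20 regressors as a stand-in for -gllamm- (which is user-written and would take far longer); the variable names, the seed, and the 50 repetitions are arbitrary choices for illustration. -set processors- is only accepted by Stata/MP, and how much speedup you see, if any, will depend on the machine.

```stata
clear all
set seed 12345
set obs 100000

* Simulated regressors and outcome (hypothetical data-generating process)
forvalues j = 1/20 {
    generate x`j' = rnormal()
}
generate y = 1 + x1 - 0.5*x2 + rnormal()

* Time the same workload with 1 processor and with the maximum available
foreach p in 1 `c(processors_max)' {
    set processors `p'
    timer clear 1
    timer on 1
    forvalues r = 1/50 {
        quietly regress y x1-x20
    }
    timer off 1
    quietly timer list 1
    display "processors = `p': " r(t1) " seconds"
}
```

If the two timings come out nearly the same, the workload is probably still too light for the parallel code to pay off; raising the number of observations or regressors is one way to check.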

-- Stas Kolenikov, PhD, PStat (SSC)
-- Senior Survey Statistician, Abt SRBI
-- Opinions stated in this email are mine only, and do not reflect the
position of my employer

On Mon, Jul 8, 2013 at 8:41 PM, Ted Player <> wrote:
> Short version:  Stata MP 12-core isn't running my code any faster than
> it did when I used Stata IC, and I can't figure out why.
> Detailed version:  I am running Windows 7 Pro SP1 64-bit on a
> quad-core machine.  I have purchased two Stata licenses.  I purchased
> Intercooled when version 12 was released.  I recently purchased MP-12
> core to make my Stata code run faster.  (I realize I only have four
> cores so the 12 core is overkill; I want the flexibility to use
> Amazon's EC2, so I purchased the 12 core version.)  Both flavors of
> Stata are version 12.
> Unfortunately, I am finding that MP does not run any faster than IC.
> Indeed, in all my tests MP is a little slower.  To document the issue,
> I did a fresh install of Stata Intercooled and then I ran the
> benchmark program (below).  I ran it three times, and the average run
> time was 18.4 seconds.  Then I uninstalled Stata completely, installed
> Stata MP-12 core, and ran the benchmark again.  I ran it three times, and the
> average was 19.6 seconds.  I'm disappointed that MP isn't running
> faster.
> The benchmark program shown below performs a bootstrap of regression.
> According to the Stata/MP Performance Report, replication-based
> commands
> such as bootstrap were not benchmarked for the report because "these
> commands run another target command repeatedly, and to the extent the
> target command's performance is improved for a particular problem
> size, a similar improvement will be obtained when it is run
> repeatedly" (p. 7).  In the benchmark program below, the target
> command is regression (which the report shows to be markedly improved
> for MP).  The part of the Stata/MP Performance Report I have quoted
> here seems to suggest I should expect a performance improvement in my
> setup when using bootstrap.
> Stata makes positively *glowing* claims about MP, but I have yet to
> find any
> improvement whatsoever.
> I have done a creturn list to verify that I have Stata/MP installed
> correctly.  The relevant parts of the creturn list are shown below:
>                   c(MP) = 1
>           c(processors) = 4                    (Stata/MP, set processors)
>       c(processors_lic) = 12
>      c(processors_mach) = 4
>       c(processors_max) = 4
>                   c(os) = "Windows"
>                c(osdtl) = "64-bit"
>         c(machine_type) = "PC (64-bit x86-64)"
> I should mention that when I look at the CPU usage with Windows Task
> Manager, it stays at 25% while the benchmark is running under MP-12 core.
> Also, I should mention that under the MP-12 core install, I have tried
> "set processors 1", and I get practically the same performance that I
> get from "set processors 4".  It seems to me that MP isn't using the
> extra cores.
> Can anyone explain to me why I'm not getting any better performance
> from MP-12 core than I'm getting from IC?
> ----------------------------------------------------------------------------------------------
> clear all
> sysuse auto
> timer on 1
> bootstrap, nodots reps(5000) seed(1): regress mpg weight gear foreign
> timer off 1
> quietly timer list
> local elapsed = r(t1)
> display "This benchmark process required ... `elapsed' ... seconds"