


st: Stata MP

From   "Katherine Lee" <>
To   <>
Subject   st: Stata MP
Date   Wed, 5 Oct 2011 11:06:11 +1100

Dear all

I have a large simulation study which I want to run, but it is taking
weeks to run on my standard 4-year-old 2-core PC (on the 2-core version
of Stata/MP). I have been trialling a 48-core version of Stata/MP on a
new 48-core Linux machine to try to improve the speed, but it does not
seem to be giving me the speed-up I had hoped for.

In order to explore the improvements in speed I have compared the time
it takes to run 3 different do-files on the two machines with the two
different versions of Stata/MP:

1. a single regression model including lots of interaction terms on a
large dataset (58 mins on the 2-core PC vs 18 mins on the 48-core Linux
machine)

2. a simulation study which involves taking a complete dataset,
setting 50% of the data for one variable to missing and multiply
imputing the missing values (using mi impute mvn) before carrying out a
simple linear regression (3 days 17 hours on the 2-core PC vs 1 day 10
hours on the 48-core Linux machine) 

3. another simulation study where 1000 observations are generated and a
Poisson regression is run (approximately 6 hours on the 2-core PC vs 2
hours on the 48-core Linux machine)

In all 3 of these examples, the do-file runs approximately 3 times
faster on the 48-core license on the Linux machine compared to the PC,
despite the fact that the information on Stata/MP performance
improvements suggests that there should be much more substantial gains
with the "regress" command used in example 1 (where speed is supposed to
be 16.5 times faster with 16 cores compared to using only a single
core) than with the "mi impute mvn" command which takes up the majority
of the running time in example 2 (where the speed is only reported to be
1.1 times faster with 16 cores compared to using only a single core).
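
As a rough check on the "approximately 3 times" figure, the timings quoted above can be turned into speed-up ratios. This is only a sketch in Python: the names timings_min and speedup are made up for illustration, and the inputs are the approximate run times given in this post, not new measurements.

```python
# Wall-clock times quoted in the post, converted to minutes.
# Each entry is (2-core PC, 48-core Linux machine); all values are approximate.
timings_min = {
    "1. single regression": (58, 18),
    "2. mi impute mvn simulation": ((3 * 24 + 17) * 60, (1 * 24 + 10) * 60),
    "3. Poisson simulation": (6 * 60, 2 * 60),
}


def speedup(slow_min, fast_min):
    """Ratio of the slower run time to the faster one."""
    return slow_min / fast_min


speedups = {
    name: speedup(pc, linux) for name, (pc, linux) in timings_min.items()
}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.1f}x faster on the 48-core machine")
```

The ratios come out at roughly 3.2, 2.6 and 3.0 for examples 1, 2 and 3. Note that they compare a 2-core run on one machine with a 48-core run on another, so they are not directly comparable to published figures measured against a single core.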

In trying to explore this further I had a look at the system usage when
running the simple regression in example 1 and found that the system
started by using just a single core for the first 3-4 minutes of running
the do-file, this then jumped to using over 30 cores for the next 1-2
minutes before slowly dropping down to around 5 cores.

Does anyone have any similar experience with the multi-core version of
Stata or any suggestions as to why I am not seeing a much greater
improvement with Stata/MP particularly when running the regression model
in example 1?

