Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: Stata MP
"Katherine Lee" <firstname.lastname@example.org>
Wed, 5 Oct 2011 11:06:11 +1100
I have a large simulation study which I want to run but it is taking
weeks to run on my standard 4 year old 2-core PC (on 2-core version of
Stata/MP). I have been trialling a 48-core version of Stata/MP on a new
48-core Linux machine to try and improve the speed but it does not seem
to be giving me the speed I would have hoped.
In order to explore the improvements in speed I have compared the time
it takes to run 3 different do-files on the two machines with the two
different versions of Stata/MP:
1. a single regression model including lots of interaction terms on a
large dataset (58 mins on the 2-core PC vs 18 mins on the 48-core Linux
machine)
2. a simulation study which involves taking a complete dataset,
setting 50% of the data for one variable to missing and multiply
imputing the missing values (using mi impute mvn) before carrying out a
simple linear regression (3 days 17 hours on the 2-core PC vs 1 day 10
hours on the 48-core Linux machine)
3. another simulation study where
1000 observations are generated and a Poisson regression is run
(approximately 6 hours on the 2-core PC vs 2 hours on the 48-core Linux
machine)
In all 3 of these examples, the do-file runs approximately 3 times
faster on the 48-core license on the Linux machine compared to the PC,
despite the fact that the information on Stata/MP performance
improvements suggests that there should be much more substantial gains
with the "regress" command used in example 1 (where speed is supposed to
be 16.5 times faster with 16 cores compared to using only a single
core) than with the "mi impute mvn" command which takes up the majority
of the running time in example 2 (where the speed is only reported to be
1.1 times faster with 16 cores compared to using only a single core).
In trying to explore this further I had a look at the system usage when
running the simple regression in example 1 and found that the system
started by using just a single core for the first 3-4 minutes of running
the do-file. It then jumped to using over 30 cores for the next 1-2
minutes before slowly dropping down to around 5 cores.
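One check that might separate the effect of core count from the hardware
difference between the two machines is to time one regression on the same
machine at several processor settings. The following is only a minimal
sketch: the simulated data, variable names, sample size and model are
hypothetical placeholders rather than my actual dataset, and it assumes
Stata/MP, since "set processors" is not available in other flavours.

```stata
* Sketch: time one regression at several processor settings on one machine.
* Data and model below are placeholders (assumptions), not the study data.
clear all
set seed 12345
set obs 1000000
generate x1 = rnormal()
generate x2 = rnormal()
generate y  = 1 + 2*x1 - x2 + 0.5*x1*x2 + rnormal()

* Report what Stata thinks it can use on this machine and licence.
display "in use: " c(processors) "  licence: " c(processors_lic) ///
    "  machine: " c(processors_mach) "  max: " c(processors_max)

local i = 0
foreach p in 1 2 4 8 16 {
    * Skip settings the licence or machine does not allow.
    if `p' > c(processors_max) continue
    local ++i
    set processors `p'
    timer clear `i'
    timer on `i'
    quietly regress y c.x1##c.x2
    timer off `i'
    quietly timer list `i'
    display "processors = `p'   seconds = " r(t`i')
}
```

If the timings stop improving after a few processors, that would suggest
the bottleneck lies in parts of the do-file that do not run in parallel
rather than in the regression itself.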
Does anyone have any similar experience with the multi-core version of
Stata or any suggestions as to why I am not seeing a much greater
improvement with Stata/MP particularly when running the regression model
in example 1?