
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: RE: Efficient parallel computing in Stata/MP


From   William Buchanan <[email protected]>
To   [email protected]
Subject   Re: st: RE: Efficient parallel computing in Stata/MP
Date   Fri, 27 Sep 2013 10:22:09 -0500

It may depend on the definition of "large", but I've noticed non-trivial differences in the time it takes to read a file into memory between Stata 12.1 MP and Stata 13 SE. The file was a .CSV with roughly 27,500 observations, and the difference was noticeable immediately, without needing to check the internal clock times.
 
Billy

On Sep 26, 2013, at 9:21 PM, Timothy Mak <[email protected]> wrote:

> It should be borne in mind that the speed increase from using MP (as opposed to SE or IC) is very often only significant if you're analysing a large dataset, especially for something as simple as -regress-. Opening multiple instances of Stata, however, does not have that limitation, as long as your computer has multiple cores. Also, I'm unsure whether you need 16 cores to fully utilize the MP power if you were to run 4 instances of Stata with MP4 capabilities. Maybe StataCorp would be able to comment on that more. See also -parallel- from SSC for a possible shortcut in parallelizing your work.
> 
> Tim
> 
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Demian Panigo
> Sent: 27 September 2013 01:02
> To: [email protected]
> Subject: st: Efficient parallel computing in Stata/MP
> 
> Dear Statalist members: I need some help, because I'm not sure about
> some Stata/MP properties for parallel computing.
> We know from http://www.stata.com/statamp/statamp.pdf that many
> estimation commands (e.g. regress) are almost fully parallelizable and
> that average efficiency for all commands is around 72%. So, in
> standard linear regression problems (e.g. running one million equations
> for parameter stability analysis), using Stata/MP in a multiple-core
> CPU would be an optimal time saving strategy.
> However, it is also possible to exploit the multi-core CPU environment
> by working with multiple parallel Stata/MP instances (e.g. using 4
> parallel Stata/MP instances to run 250,000 linear regressions on
> each core).
> My question is simple: can I save some time by using this "dual
> parallelism" methodology? (because parallel computing is
> automatically used by Stata/MP to parallelize internal tasks of, for
> example, regress; and because I also parallelize the whole set of
> regressions between 4 cores, by means of multiple Stata/MP instances).
> Thanks in advance
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
> 
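For readers weighing the "dual parallelism" question in the original post, the trade-off can be roughed out with a back-of-the-envelope Amdahl-style model. The sketch below is not taken from the thread or from StataCorp's report: the function names, the parallel fraction p, and the unit task time t are all hypothetical, and the model assumes every instance really gets the physical cores it asks for. It ignores memory bandwidth, disk I/O, and licence limits on cores.

```python
def mp_time(n_tasks, cores, p, t=1.0):
    """Wall time for ONE instance running n_tasks one after another.

    Each task takes t on a single core; a fraction p of it (hypothetical)
    is assumed to parallelize perfectly over `cores` (Amdahl's law).
    """
    return n_tasks * t * ((1.0 - p) + p / cores)


def split_time(n_tasks, instances, cores_per_instance, p, t=1.0):
    """Wall time when the tasks are split evenly across independent instances.

    Assumes instances * cores_per_instance physical cores are available,
    so the instances do not compete with each other.
    """
    per_instance = -(-n_tasks // instances)  # ceiling division
    return mp_time(per_instance, cores_per_instance, p, t)


if __name__ == "__main__":
    n = 1_000_000   # regressions, as in the original question
    p = 0.9         # hypothetical parallel fraction, NOT a measured value
    scenarios = [
        ("1 instance  x 4 cores (4 cores total)", 1, 4),
        ("4 instances x 1 core  (4 cores total)", 4, 1),
        ("2 instances x 2 cores (4 cores total)", 2, 2),
        ("4 instances x 4 cores (16 cores total)", 4, 4),
    ]
    for label, instances, cores in scenarios:
        print(f"{label}: {split_time(n, instances, cores, p):,.0f} time units")
```

Under these assumptions, splitting independent regressions across single-core instances is never slower than one multi-core instance on the same number of cores, because (1 - p) + p/c is at least 1/c for any p between 0 and 1. Running 4 instances that each use 4 cores only helps further if 16 physical cores are actually present, which matches the doubt Tim raises above.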

