Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Daniel Feenberg <feenberg@nber.org> |
To | "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |
Subject | st: RE: Efficient parallel computing in Stata/MP |
Date | Fri, 27 Sep 2013 07:43:36 -0400 (EDT) |
-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Demian Panigo
Sent: 27 September 2013 01:02
To: statalist@hsphsun2.harvard.edu
Subject: st: Efficient parallel computing in Stata/MP

Dear Statalist members:

I need some help, because I'm not sure about some Stata/MP properties for parallel computing.

We know from http://www.stata.com/statamp/statamp.pdf that many estimation commands (e.g. regress) are almost fully parallelizable and that average efficiency for all commands is around 72%. So, in standard linear regression problems (e.g. running one million equations for parameter stability analysis), using Stata/MP on a multi-core CPU would be an optimal time-saving strategy.

However, it is also possible to exploit the multi-core CPU environment by working with multiple parallel Stata/MP instances (e.g. using 4 parallel Stata/MP instances to run 250,000 linear regressions on each core).

My question is simple: can I save some time by using this "dual parallelism" methodology? (Parallel computing is automatically used by Stata/MP to parallelize the internal tasks of, for example, regress, and I would also split the whole set of regressions across 4 cores by means of multiple Stata/MP instances.)

Thanks in advance
In my experience, Stata/MP fully exploits as many real cores as are available, very efficiently for regression commands. If you have hypercores, running multiple Stata jobs will exploit those efficiently also. I posted the results of a simple experiment at:
http://www.nber.org/stata/efficient under heading "Stata/MP".

-parallel.ado- is a very interesting routine. It will start up multiple Stata processes and let each one read a part of the dataset, then combine the results into a single dataset. For processes that are single-threaded for no good reason, or if you don't have Stata/MP, it seems like a great idea. I believe it will also work well with hyper-cores, but I have no experience with it. But for I/O it would just make things worse, since each thread has to read the entire dataset.
See http://www.stata.com/statamp/report.pdf for a more discouraging report on hyper-cores. I don't have an explanation for the difference in experiences. There is no substitute for experimentation on your actual hardware, and there would be interest on this list in your experience.
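
Before running that experiment, a back-of-the-envelope model can show what to expect from the two strategies in the question. The Python sketch below is only an illustration under stated assumptions, not a measurement: it treats the 72% figure as the parallelizable fraction in Amdahl's law (an assumption about how to read that number), it uses a made-up cost of 0.001 seconds per regression, and it ignores startup time, I/O, memory contention, and hyper-threading. The function names are hypothetical.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup of a single job on `cores` cores under Amdahl's law."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def time_one_mp_instance(n_jobs, secs_per_job, parallel_fraction, cores):
    """One instance runs the jobs one after another, using every core per job."""
    return n_jobs * secs_per_job / amdahl_speedup(parallel_fraction, cores)


def time_split_instances(n_jobs, secs_per_job, instances):
    """Jobs are split evenly across instances that each use a single core.

    Assumes perfect independence: no startup, I/O, or memory contention.
    """
    jobs_per_instance = -(-n_jobs // instances)  # ceiling division
    return jobs_per_instance * secs_per_job


n_jobs = 1_000_000    # regressions, as in the question
secs_per_job = 0.001  # ASSUMPTION: hypothetical single-core cost per regression
cores = 4

# 0.72 is the figure quoted in the question; 0.95 and 1.0 are illustrative.
for p in (0.72, 0.95, 1.0):
    t_mp = time_one_mp_instance(n_jobs, secs_per_job, p, cores)
    t_split = time_split_instances(n_jobs, secs_per_job, cores)
    print(f"parallel fraction {p:.2f}: "
          f"one MP instance {t_mp:8.1f}s, {cores} split instances {t_split:8.1f}s")
```

In this idealized model, splitting the regressions across single-core instances is never slower than one multi-core instance, and the two tie only when the command parallelizes perfectly. Real hardware can behave differently, especially once each instance has to read the data, which is why timing both setups on the actual machine is still needed.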
Daniel Feenberg
NBER