Statalist


Re: st: StataMP will not utilize multiple processors (Windows XP)


From   Stas Kolenikov <[email protected]>
To   [email protected]
Subject   Re: st: StataMP will not utilize multiple processors (Windows XP)
Date   Sat, 1 Aug 2009 21:24:59 -0500

I think what Austin suggests is perfectly sensible. Here's how you
could hack -bstrap- (and that's probably what Austin's package does in
its guts):

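== mybs.do ==
* the seed is passed as the first argument: -do mybs 111-
set seed `1'
* fit once on the full sample and keep the estimates as starting values
xtmelogit y x || panel:
matrix bb = e(b)
* copy of the panel id; -idcluster()- overwrites it in each resample
gen bspanel = panel
bootstrap, saving(mybs-`1', replace) reps(50) cluster(panel) ///
    idcluster(bspanel): xtmelogit y x || bspanel:, from(bb)
== mybs.do ==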

-from(bb)- might be redundant if Stata picks up the estimation results
from the zeroth run of the -bootstrap- to provide the starting values,
but if it does not, this trick will speed things up a little bit.

Then on your machine, start 8 copies of Stata and run -do mybs 111- in
the first window, -do mybs 222- in the second window, etc. Once that
finishes, you will have 8 files that look like mybs-111, mybs-222...
mybs-888. You need to convince Stata those are the results from the
same bootstrap. Close all Statas except the first two; in the second
one, type

use mybs-111, clear
append using mybs-222
append using mybs-333
...
append using mybs-888
save mybs-111, replace
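
The same appending can also be written as a loop. This is only a minimal sketch, assuming the eight seeds really were 111, 222, ..., 888 and that all eight result files sit in the current working directory:

```stata
* stack the per-seed result files on top of the first one
use mybs-111, clear
foreach s of numlist 222(111)888 {
    append using mybs-`s'
}
* one row per replication, so about 8 x 50 = 400 rows are expected
count
save mybs-111, replace
```

The -count- line is just a sanity check that no file was skipped before the combined file overwrites mybs-111.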

Then mybs-111 will contain all of the bootstrap results. You can close
the second Stata now. Go back to the first Stata (which still thinks
that mybs-111 is the one and only bootstrap results file) and type
-bootstrap-. It should (I stress SHOULD, rather than will for
sure) pick up all the results from your appended file.
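
If the replay does not pick up the appended replications (it may simply redisplay the estimation results already held in memory), one alternative worth trying is -bstat-, which computes bootstrap statistics from a dataset of replicates. This is a sketch only, untested on the setup above, and it assumes the appended file kept the characteristics (observed values, number of clusters) that -bootstrap- stored when it saved mybs-111:

```stata
* recompute the bootstrap summary from the combined replicates
use mybs-111, clear
bstat
* percentile confidence intervals from the same replicates
estat bootstrap, percentile
```

If -bstat- complains that the observed statistics are missing, they may have to be supplied through its -stat()- option from the original fit.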

Make sure that you understand every step here -- this is a hack of
what Stata does and how it does it, and this is not a very well
protected hack. Click "I accept the terms of being desperate" to
proceed...

All simulations are inherently highly parallelizable. The programming
question, I guess, is how to arrange the pipeline of storing the
estimation results so that no results are getting lost because of the
clashes when they arrive at the same time. I am sure that's doable on
StataCorp's side... but I am also sure that their priorities are in
an entirely different dimension, judging by the number of new things in
the new release.

On Fri, Jul 31, 2009 at 1:55 PM, Cohen, Elan<[email protected]> wrote:
> Thank you Austin.  That was very informative.  It turns out that neither my example (bootstrap) nor my real interest (xtmelogit) utilizes multiple processors (but gllamm apparently does).
>
> I'm currently sharing the server I'm on, so my timings would not be representative.  But I'll certainly try it asap (just out of curiosity).
>
>
> Thank you everyone.
>
> - Elan
>
>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of
>> Austin Nichols
>> Sent: Friday, July 31, 2009 14:36
>> To: [email protected]
>> Subject: Re: st: StataMP will not utilize multiple processors
>> (Windows XP)
>>
>> MP only does fine-grain parallelization; -bootstrap- and -simulate- do
>> not send iterations to separate processors, but each iteration to all
>> available processors.  About a year ago, I wrote -bs2- and -sim2- to
>> split iterations across processors (just need to keep track of a list
>> of random numbers), but they did not perform all that well, so I never
>> released them.  See the table on p.17 of
>> http://www.stata.com/statamp/report.pdf  for some expected time
>> improvements for different commands.
>>
>> Can you show us some timings for your system like this:
>>
>> . drawnorm v1-v50, n(100000) clear
>> (obs 100000)
>> r; t=13.26 14:22:03
>>
>> . qui reg v*
>> r; t=1.86 14:22:10
>>
>> . set processor 1
>> number of available processor(s) changed from 2 to 1
>> r; t=0.00 14:22:15
>>
>> . qui reg v*
>> r; t=3.07 14:22:23
>>
>> . qui bs: reg v*
>> r; t=123.09 14:24:51
>>
>> . set processor 2
>> number of available processor(s) changed from 1 to 2
>> r; t=0.00 14:25:04
>>
>> . qui bs: reg v*
>> r; t=90.16 14:26:39
>>
>>
>> On Fri, Jul 31, 2009 at 11:54 AM, Cohen, Elan<[email protected]> wrote:
>> >> How do you know you are only using one processor?
>> >
>> > I'm watching the task manager as it's running.  The Stata process only uses 13% of the CPU which is approximately 1/8 of 100%.
>> >
>> >> The bootstrap is not parallelized, but each iteration of the
>> >> estimation should be.
>> >
>> > This begs the question, are there some Stata commands that are built to run on multiple processors and some that aren't?  I would've thought, if any, bootstrap would work on multiple processors.
>> >
>> >
>> >> On Fri, Jul 31, 2009 at 10:36 AM, Cohen, Elan<[email protected]> wrote:
>> >> > Hi all,
>> >> >
>> >> > I'm running StataMP on a Windows server running XP.  The computer has a Quad core processor, so 8 processors show up in the task manager (due to hyper threading, I believe).
>> >> >
>> >> > However, if I run a command, such as bootstrap, Stata will only allocate 1 processor to do the job.
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>>



-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.



