Re: st: do variables not used in a process take up memory while a process runs?


From   Daniel Feenberg <feenberg@nber.org>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: do variables not used in a process take up memory while a process runs?
Date   Wed, 4 May 2011 11:14:24 -0400 (EDT)


On Wed, 4 May 2011, Doug Hess wrote:

I'm running models with 30 predictors on 150,00 records using
xtmelogit to examine random intercepts. As you can imagine this takes
a long time to run. I started a model last Thursday night (US east
coast)  and it didn't produce results until Queen Elizabeth II stepped
into Westminster Abbey Friday morning (perhaps propitious for my
research depending on your belief in the divine right of monarchs, but
it was nine hours after Stata started the process).

So, I'm looking for any tricks to speed things up (using Stata 11/IC
on a Windows 7 PC with 2.66 GHz Intel and 2.96 GB RAM usable). I tried
the Laplacian option but it didn't seem to speed things up and I'm not
sure if the estimates are considered reliable, so to speak, if you use
that option.

One question: I first -keep- only the variables in the model, does
this speed things up? I.e., is Stata turning around in its head all
the data in the database, or just those that are in the model as it
runs the process?

Probably not. Stata keeps all variables in core, and the time to load a value doesn't really depend on the amount of core in use, unless you are starting to page. Paging would manifest itself as disk noise and very slow operation even before you started the estimation. If you were running in SAS, which reads its data from disk, dropping unused variables would be a potentially huge win.
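A minimal sketch of how one might check this, assuming hypothetical variable names (y for the outcome, x1-x30 for the predictors, id for the grouping variable; none of these come from the original post). It only reports how much memory the dataset occupies before and after trimming, which can be compared with the machine's physical RAM to judge whether paging is likely:

```stata
* Hypothetical names: y = outcome, x1-x30 = predictors, id = grouping variable
* Report observations, variables, and dataset size in memory
describe, short

* Keep only the variables the model uses, then shrink storage types
keep y x1-x30 id
compress

* Report the size again; a smaller dataset matters only if memory was tight
describe, short
```

If the reported size is well below available RAM in both cases, trimming variables is unlikely to change the time per iteration.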


Second question: if I use a USB memory stick to "readyboost" the
memory, does this help speed Stata up for such processes?


Again, Stata and the data are in core, so readyboost won't have a
significant effect. It may let the process start up a little faster,
but it will have no effect on the time per calculation.

I'm open to other thoughts. Or am I better off dumping the results
into other specialized software for hierarchical modeling? (No offense
to Stata.)


Stata/MP, perhaps? (I don't know if -xtmelogit- is affected by multiple cores, but like most econometric calculations it is "embarrassingly parallel".)
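One way to compare options before committing to another nine-hour run is to time the model on a subset. The following is only a sketch under stated assumptions: the variable names (y, x1-x30, id) and the subset condition (id <= 200, which assumes a numeric group identifier) are hypothetical, and timings on a subset may not scale linearly to the full data. It uses -timer- to compare the default adaptive quadrature with the -laplace- option (equivalent to intpoints(1)), and displays whether the running Stata is Stata/MP:

```stata
* Hypothetical names: y = outcome, x1-x30 = predictors, id = grouping variable
* Show whether this is Stata/MP (1 = yes) and how many processors it uses
display c(MP)
display c(processors)

timer clear

* Timer 1: default adaptive Gaussian quadrature, on a subset of groups
timer on 1
xtmelogit y x1-x30 if id <= 200 || id:
timer off 1

* Timer 2: Laplacian approximation, same subset
timer on 2
xtmelogit y x1-x30 if id <= 200 || id:, laplace
timer off 2

* Elapsed seconds for each timer
timer list
```

Comparing the coefficient estimates from the two runs alongside the timings gives some indication of whether the Laplacian approximation is adequate for this model.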

Daniel Feenberg

Thank you.

Doug
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

