The Stata listserver

Re: st: Hardware environments for STATA manipulation of large datasets


From   Stas Kolenikov <[email protected]>
To   [email protected]
Subject   Re: st: Hardware environments for STATA manipulation of large datasets
Date   Tue, 13 Dec 2005 17:09:02 -0600

So what exactly do you do with your data? 6 million observations
require about 6MB per byte variable and about 48MB per double variable,
so a regression with, say, 10 variables mixing binary and continuous
predictors will still fit in your 512MB of memory. For anything bigger
than that, you would be better off with more RAM... which is kinda
obvious. The
best academic machine that I ever had access to, as far as I can
recall, had something like 48GB of memory under Solaris. I'd say
figure out the most complex model you are going to estimate, in terms
of the number of variables, and get a machine with at least twice as
much RAM as that model's data would occupy. Or better, four times as
much, so that your model can still double in complexity and there is
room for intermediate results, etc. If you are thinking about 40
variables, that is roughly 2GB of doubles, so you would probably need
8GB to be on the relatively safe side. I don't think the OS matters
much except for
overall reliability; if you trust WinXP enough, that's up to you -- I
would probably move to Linux with a SCSI RAID, to make things move
faster, especially if you are doing a lot of data handling like
merging, subsetting, etc.
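
The arithmetic behind these figures can be sketched as follows. This is
a back-of-the-envelope estimate in Python, not a description of how
Stata allocates memory: the function names and the headroom factor of
four are illustrative assumptions, and Stata's own overhead is ignored.

```python
# Rough memory estimate for a dataset held entirely in RAM.
# Widths are the sizes in bytes of Stata's numeric storage types.
BYTES_PER_TYPE = {"byte": 1, "int": 2, "long": 4, "float": 4, "double": 8}


def variable_mb(n_obs, storage_type):
    """Megabytes (10^6 bytes) needed to hold one variable."""
    return n_obs * BYTES_PER_TYPE[storage_type] / 1e6


def recommended_ram_gb(n_obs, n_vars, storage_type="double", headroom=4):
    """Data size times a headroom factor (an assumed rule of thumb), in GB."""
    data_bytes = n_obs * n_vars * BYTES_PER_TYPE[storage_type]
    return headroom * data_bytes / 1e9


if __name__ == "__main__":
    n = 6_000_000
    print(variable_mb(n, "byte"))      # one byte variable, in MB
    print(variable_mb(n, "double"))    # one double variable, in MB
    print(recommended_ram_gb(n, 40))   # 40 doubles with 4x headroom, in GB
```

With 6 million observations this gives 6MB per byte variable, 48MB per
double variable, and just under 8GB for 40 double variables with
fourfold headroom.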

Another specific recommendation is to keep your data as small as
possible in terms of observations, variables, and storage types
(always type -compress- before -save-).
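
To see why storage types matter, here is a minimal Python sketch of the
idea behind -compress- for integer-valued variables: pick the narrowest
type whose range covers the data. It is only an illustration under
simplifying assumptions, not Stata's actual algorithm (which also
handles floats and strings); the function name is hypothetical, and the
ranges are Stata's documented limits for byte, int, and long.

```python
# Stata integer storage types: (name, bytes per value, min, max).
INTEGER_TYPES = [
    ("byte", 1, -127, 100),
    ("int", 2, -32_767, 32_740),
    ("long", 4, -2_147_483_647, 2_147_483_620),
]


def smallest_integer_type(values):
    """Return the narrowest type that holds every value, else 'double'.

    Assumes a non-empty sequence of numbers without missing values.
    """
    if any(v != int(v) for v in values):
        return "double"  # non-integer data stays at full precision here
    lo, hi = min(values), max(values)
    for name, _width, type_min, type_max in INTEGER_TYPES:
        if type_min <= lo and hi <= type_max:
            return name
    return "double"  # integers too large for long


if __name__ == "__main__":
    print(smallest_integer_type([0, 1, 1, 0]))   # a binary indicator
    print(smallest_integer_type([1990, 2005]))   # calendar years
    print(smallest_integer_type([0.5, 1.0]))     # a continuous measure
```

A binary indicator stored as a double costs 48MB at 6 million
observations; stored as a byte it costs 6MB.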

On 12/13/05, Anirudh V. S. Ruhil <[email protected]> wrote:
> There may have been some traffic about a particular element or the other of
> my question, but nevertheless ...
>
> We are working with extremely large datafiles (hitting 6 million obs with
> only partial list of covariates attached), and our usual academic P4
> puppies with 512MB RAM, Win XP, and Stata 9 are simply unable to do a
> thing. Many of you have run into similar issues in the past and so may be
> able to tell me when you did step out to buy new hardware and software,
> what exactly were the key specs, what platform, etc. did you find useful.
> Moving off Win XP is not an issue so be bold.
>
> thanks in advance
>
> Ani
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


--
Stas Kolenikov
http://stas.kolenikov.name



