Statalist



Re: st: RE: working with large datasets


From   "Sergiy Radyakin" <serjradyakin@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: working with large datasets
Date   Mon, 11 Feb 2008 17:11:48 -0500

Hello Adrian,

according to:
http://www.stata.com/products/64bitintro.html
"You are limited only by the total amount of memory on your machine."

So just install 32 GB of memory and give it a try. I would very much
like to hear about the results.

Be aware, however (in light of the recent discussions of pointer
size), that you might still be constrained by the 32-bit nature of the
observation number. (E.g. you can have about 2,000 million obs in your
dataset, which is not enough for a census of China and India together.)
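
To make the arithmetic behind that figure concrete, here is a minimal
sketch in Python (not Stata). It assumes the observation number is held
in a signed 32-bit integer, which is my reading of the limit and not
something confirmed here; the name fits_in_int32 is hypothetical.

```python
# Largest value a signed 32-bit integer can hold.
# Assumption: the observation number is stored in such a counter.
INT32_MAX = 2**31 - 1


def fits_in_int32(n_obs: int) -> bool:
    """Return True if n_obs observations can be indexed by a signed 32-bit counter."""
    return 0 <= n_obs <= INT32_MAX


print(f"{INT32_MAX:,}")               # 2,147,483,647
print(fits_in_int32(2_000_000_000))   # True
print(fits_in_int32(2_500_000_000))   # False
```

The actual ceiling in a given Stata release may sit slightly below this
value, so treat it as an upper bound.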

Also be prepared to throw in some more memory for large datasets if
they are of the "long" nature (many observations, few variables),
because Stata commands create additional temporary variables when they
run. So if you have 30 GB in 5 variables, be prepared to provide up to
60 GB of memory :)
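
A rough back-of-the-envelope version of that estimate, again as a
Python sketch. It assumes every temporary variable is as wide as the
average existing variable and spans all observations; the function name
peak_memory_gb and the count of temporaries are hypothetical, since how
many a command creates depends on the command.

```python
def peak_memory_gb(data_gb: float, n_vars: int, n_temp_vars: int) -> float:
    """Estimate peak memory when a command adds n_temp_vars temporary variables.

    Assumption: each temporary variable costs as much as the average
    existing variable (data_gb / n_vars).
    """
    per_var_gb = data_gb / n_vars
    return data_gb + n_temp_vars * per_var_gb


print(peak_memory_gb(30, 5, 1))  # 36.0
print(peak_memory_gb(30, 5, 5))  # 60.0
```

With 5 variables, each temporary adds about 6 GB, so five of them
double the footprint. In a "wide" dataset with hundreds of variables
the same temporaries would add only a small fraction.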

Best regards,
   Sergiy Radyakin





On 2/11/08, Adrian de la Garza <kokootchke@hotmail.com> wrote:
> Dear all,
>
> I think I found an answer to my first question. I can type:
>
> use var1 var2 var3 using filename.dta
>
> to extract var1, var2, and var3 from the file. I figured I can even write things like:
>
> use var1 var2 var3 if var1 == 1 using filename.dta
>
> in order to limit the number of observations I work with, and hence reduce the amount of data I load into Stata.
>
> I think this looks pretty good, but let me know if you know of any other method that may be preferable. Also, let me know if you have an answer to question 2 below.
>
> Best,
> Adrian
>
>
> > From: kokootchke@hotmail.com
> > To: statalist@hsphsun2.harvard.edu
> > Subject: working with large datasets
> > Date: Mon, 11 Feb 2008 16:31:34 -0500
> >
> >
> > Hello!
> >
> > Do you guys know how I can work with a very large dataset (30 Gb) in Stata? The dataset is in a text format, and I'd like to know:
> >
> > 1. How I can extract a few variables or delete observations that don't meet certain criteria; and
> >
> > 2. if it's possible to work with the entire dataset in Stata.
> >
> > Thank you very much in advance!
> >
> > Cheers,
> > Adrian
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


