
Re: st: RE: working with large datasets

From	"Sergiy Radyakin" <[email protected]>
To	[email protected]
Subject	Re: st: RE: working with large datasets
Date	Mon, 11 Feb 2008 17:11:48 -0500

Hello Adrian,

According to the 64-bit Stata page:
http://www.stata.com/products/64bitintro.html
"You are limited only by the total amount of memory on your machine."

So just install 32 GB of memory and give it a try. I would very much
like to hear about the results.
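
A minimal sketch of what "give it a try" could look like, assuming a
64-bit Stata of this era (where memory has to be allocated by hand
before data are loaded) and a machine with 32 GB of RAM. The 28g figure
is only an illustrative choice that leaves some room for the operating
system; the file and variable names are the ones from Adrian's message.

```stata
* Assumption: 64-bit Stata on a machine with 32 GB of RAM
* Memory can only be resized when no data are loaded
clear
set memory 28g

* Report how the allocated memory is being used
memory

* Load only the variables and observations that are needed
use var1 var2 var3 using filename.dta if var1 == 1
```

Newer Stata releases manage memory automatically, so the set memory
step may be unnecessary there.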

Be aware, however (in light of the recent discussions of pointer
size), that you might still be constrained by the 32-bit nature of the
observation number. E.g., you can have about 2,000 million observations
in your dataset, which is not enough for a census of China and India
together.
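
The ceiling implied by a signed 32-bit observation counter can be worked
out directly in Stata. This is only a sketch of the arithmetic: that the
counter is a signed 32-bit integer is an assumption here, and
c(max_N_theory) is the system value to check for what a particular
installation actually reports.

```stata
* Largest value a signed 32-bit integer can hold: 2^31 - 1
display %15.0fc 2^31 - 1

* Theoretical maximum number of observations reported by this Stata
display %15.0fc c(max_N_theory)
```

If the two numbers agree, the observation limit on that installation is
the 32-bit one described above.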

Also be prepared to throw in some more memory for large datasets if
they are of the "long" kind (many observations, few variables),
because Stata commands create additional temporary variables when they
run. So if you have 30 GB in 5 variables, be prepared to provide up to
60 GB of memory :)
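
To see where a figure like 60 GB can come from, here is a rough
back-of-the-envelope calculation. It assumes, purely for illustration,
that all 5 variables are stored as doubles (8 bytes each) and that
30 GB means 30e9 bytes; real datasets will differ. Under these
assumptions there are about 750 million observations, and every
temporary double a command creates costs about 6 GB, so five of them
would double the footprint.

```stata
* Assumptions (illustrative only): 5 variables, all doubles, 30e9 bytes
local bytes_total = 30e9
local width       = 5 * 8
local nobs        = `bytes_total' / `width'

* Approximate number of observations
display %15.0fc `nobs'

* Approximate cost, in GB, of one additional temporary double
display %6.1f (`nobs' * 8) / 1e9
```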

Best regards,
   Sergiy Radyakin





On 2/11/08, Adrian de la Garza <[email protected]> wrote:
> Dear all,
>
> I think I found an answer to my first question. I can type:
>
> use var1 var2 var3 using filename.dta
>
> to extract var1, var2, and var3 from the file. I figured I can even write things like:
>
> use var1 var2 var3 using filename.dta if var1 == 1
>
> in order to limit the number of observations I work with, and hence reduce the file size I use to work with Stata.
>
> I think this looks pretty good, but let me know if you know of any other method that may be preferable. Also, let me know if you have an answer to question 2 below.
>
> Best,
> Adrian
>
>
> > From: [email protected]
> > To: [email protected]
> > Subject: working with large datasets
> > Date: Mon, 11 Feb 2008 16:31:34 -0500
> >
> >
> > Hello!
> >
> > Do you guys know how I can work with a very large dataset (30 Gb) in Stata? The dataset is in a text format, and I'd like to know:
> >
> > 1. How I can extract a few variables or delete observations that don't meet certain criteria; and
> >
> > 2. if it's possible to work with the entire dataset in Stata.
> >
> > Thank you very much in advance!
> >
> > Cheers,
> > Adrian
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>

