Statalist



Re: st: set memory with Stata/MP 10.0


From   Steven Samuels <sjhsamuels@earthlink.net>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: set memory with Stata/MP 10.0
Date   Thu, 7 Feb 2008 12:13:21 -0500

I agree with Ronan. You can try to read the data in pieces by using -infix-, or another input command, with an 'in' qualifier. If all analysis variables are categorical, then follow Ronan's advice: use -contract- with the 'zero' option to create a reduced data set from each chunk, with the count variable _freq to weight the analyses. If some variables are 'exact', you might group them (dates can be converted to years, for example). Even if you could input everything, any interesting analysis would take an eternity.
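A rough sketch of the chunk-and-contract idea, to show the mechanics only. The variable names (sex, agegrp), the column positions, and the chunk size are made up for illustration, and the six-line demo file stands in for the real raw file, so adapt all of these to the actual layout. After each chunk is contracted, the counts are added to a running file with -collapse (sum)-.

```stata
* Sketch only: variable names, column positions, and sizes are hypothetical.
* Build a tiny fixed-width demo file so the example runs on its own.
tempfile raw counts
tempname fh
file open `fh' using "`raw'", write text
foreach rec in "11" "12" "11" "21" "11" "12" {
    file write `fh' "`rec'" _n
}
file close `fh'

local chunk 3        // rows per chunk; millions in practice
local total 6        // total rows in the raw file
local first 1
forvalues start = 1(`chunk')`total' {
    local stop = min(`start' + `chunk' - 1, `total')
    * Read only rows start..stop of the raw file
    infix byte sex 1 byte agegrp 2 using "`raw'" in `start'/`stop', clear
    * One row per covariate pattern, with the count stored in _freq
    contract sex agegrp, zero
    * Add this chunk's counts to the running totals
    if !`first' append using "`counts'"
    collapse (sum) _freq, by(sex agegrp)
    save "`counts'", replace
    local first 0
}

* Analyses then use _freq as a frequency weight
tabulate sex agegrp [fweight = _freq]
```

The reduced file has one row per combination of values, so its size depends on the number of covariate patterns and not on the number of raw rows.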

You do not say what the purpose of your study is. But with so much data, you can afford to create models on one set of observations and test them for predictive power on other sets.
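To illustrate the build-on-one-set, test-on-another idea, here is a minimal sketch. It uses the auto data shipped with Stata only so that it runs as written; the model, the 50/50 split, the seed, and the 0.5 cutoff are arbitrary assumptions, not recommendations. It also assumes a Stata version that has runiform() (older versions used uniform()).

```stata
* Sketch only: the data set, model, split, and cutoff are illustrative choices.
sysuse auto, clear
set seed 12345

* Randomly assign about half of the observations to the model-building set
generate byte train = runiform() < 0.5

* Fit the model on the building set only
logit foreign mpg weight if train

* Predict for the held-out observations and classify at 0.5
predict double phat if !train, pr
generate byte pred = phat >= 0.5 if !train & !missing(phat)

* Compare predicted with observed outcomes in the hold-out set
tabulate foreign pred if !train
```

With 186 million rows the hold-out set can itself be very large, so several independent test sets are feasible.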

Good luck!


On Feb 6, 2008, at 5:50 AM, Ronan Conroy wrote:


On 5 Feb 2008, at 16:48, Genty, Celine wrote:

How can I open a database of 14 GB (11 columns and 186,000,000 rows) with Stata/MP 10.0?

It may be worth remembering that sample size estimation is the art of getting the required amount of precision with the minimum data.

Just because you have that many observations does not mean you need them all.

The existence of only 11 variables also suggests that the number of covariate patterns in your data is much smaller than the number of observations, so that the dataset might also usefully be collapsed.

But you do have a problem in that there's no way to get 14 gigs of data into 4 gigs of memory.



