
From: "Sergiy Radyakin" <serjradyakin@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Is a data set with size 296,242,984 too big for stata to analyze?
Date: Mon, 10 Nov 2008 13:40:57 -0500

Hello Mandy,

Below is a pessimistic scenario showing that you might not be able to work with such large datasets in 32-bit Stata. It makes extreme assumptions to show that this can be the case, but it need not be true in your situation.

Suppose you have 2 variables of type byte, X and Y. The width of this dataset is 2 bytes, and with the file size you quote you have ~148,121,492 observations. The actual memory consumption is 8+2 = 10 bytes per observation (in MP), making it 1,481,214,920 bytes, or about 1.4 GB. You've managed to load this dataset into memory, so probably you have MORE than 2 variables, or they are of wider types, or you have Stata SE.

The key pieces of information that are missing are the number of variables in your compressed dataset and the amount of memory free after you load the dataset. Regress will attempt to create temporary variables (at least one, to store e(sample), but possibly more; the exact number is undocumented). If there is no space to accommodate those temporary variables, you will not be able to work with this dataset.

Also, please be specific about whether by "size" you mean "size of data in bytes", as I mean it, or "size of used memory", as Martin means it. (The difference is the overhead, and judging by the numbers Martin reports, I can be confident that he is using an MP version of Stata :) .)

Best regards,
Sergiy Radyakin

On Sun, Nov 9, 2008 at 6:37 AM, Martin Weiss <martin.weiss1@gmx.de> wrote:
> As others have said, it all depends on your OS, available memory, version of
> Stata and so on. The answer to your question is: It depends... I have just
> tried to drive up the size of the nlsw88 dataset by repeatedly -expand-ing it
>
> **********
> sysuse nlsw88.dta, clear
> expand 3800
> d
> **********
>
> which easily took me to "size: 298,718,000" without any complaints from
> Stata. I had -set mem 1G- beforehand. -des- returned that over 75% of the
> memory were still free...
>
> HTH
> Martin
> _______________________

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
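The arithmetic in Sergiy's pessimistic scenario can be checked with a few lines. This is only a sketch in Python, not Stata code. The 8-byte per-observation overhead is the figure he quotes for Stata/MP. The 1-byte width of the temporary variable is an assumption made here for illustration, since the number and type of temporaries that regress creates are undocumented.

```python
# Worked version of the pessimistic two-variable scenario from the message.
DATA_BYTES = 296_242_984   # "size" quoted in the subject line
WIDTH_BYTES = 2            # two byte-type variables, X and Y, 1 byte each
OVERHEAD_BYTES = 8         # per-observation overhead quoted for Stata/MP
TEMP_VAR_BYTES = 1         # ASSUMPTION: one extra byte-wide temporary variable


def memory_needed(data_bytes, width, overhead):
    """Return (observations, total bytes held in memory)."""
    obs = data_bytes // width
    return obs, obs * (width + overhead)


obs, total = memory_needed(DATA_BYTES, WIDTH_BYTES, OVERHEAD_BYTES)
extra = obs * TEMP_VAR_BYTES  # cost of one temporary variable under the assumption

print(f"{obs:,} observations")
print(f"{total:,} bytes = {total / 2**30:.2f} GiB")
print(f"one temporary byte variable adds {extra:,} bytes")
```

Under these assumptions the dataset alone needs about 1.38 GiB, and each additional byte-wide temporary variable adds roughly another 148 MB, which is why the free memory left after loading matters.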

