From: "Erik Ø. Sørensen" <email@example.com>
Subject: Re: st: Limits of Stata SE
Date: Wed, 30 Oct 2002 10:40:39 -0500
> Of course, I only have 256M of memory, but before I invest in more
> memory, I wanted to see if experienced folks thought Stata would be a
> good tool for such large datasets, assuming my computer had more memory.

I use Stata on a number of datasets this size (and some a bit larger) and am mostly very happy with it. Sometimes, for very costly operations, I have to split the datasets and iterate over the separate chunks, but it works fine. I have not used the Special Edition, but when a dataset is very large in number of observations, memory is the main limitation -- with millions of observations, you really do not want to try thousands of variables...
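In case it helps, a minimal sketch of what I mean by splitting into chunks -- the filename, the selection criterion, and the chunk size are all made up, so adjust to your data:

```stata
* Hypothetical sketch: pull the records that meet a criterion out of
* a large file in observation-range chunks, then stack the results.
* bigdata.dta, year >= 1990, and the 1,000,000-row chunks are examples.
forvalues i = 1/5 {
    local first = (`i' - 1) * 1000000 + 1
    local last  = `i' * 1000000
    use in `first'/`last' using bigdata, clear
    keep if year >= 1990        // the selection criterion goes here
    save chunk`i', replace
}
* combine the selected chunks into one file
use chunk1, clear
forvalues i = 2/5 {
    append using chunk`i'
}
save selected, replace
```

Only the selected records ever have to fit in memory at once, which is what makes this workable on a modest machine.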
> Most of my manipulations are pretty simple, mostly just selecting
> records meeting certain criteria and saving them to a separate file. The
> most challenging task, I suspect, will be using Jeroen Weesie's
> wonderful mmerge program to do some matching across the files.

I have no experience with mmerge, but I have done a lot of merging and matching -- as I said, sometimes splitting into chunks first. But thinking through how you program difficult merge operations is necessary with any system.
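For what it is worth, the general shape of a match across two files looks something like this -- person_id and the filenames are invented, and since I have not used mmerge myself, check -help mmerge- for the exact options:

```stata
* Hypothetical sketch of matching across files with -mmerge-.
* person_id, persons.dta, and hh.dta are made-up names.
use persons, clear
mmerge person_id using hh, type(n:1)
tab _merge                  // always inspect how many records matched
```

Whatever tool you use, tabulating the match result before going further is the step that catches most mistakes.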
> If I added sufficient memory (say, increasing my RAM to 1 gig, since
> the datasets take up about 200M each), am I likely to find Stata
> satisfactory for this task? My computer has a fast Pentium 3, to the
> degree that matters. Or should I look towards a dedicated database
> management package, like MySQL or, Heaven forbid, SAS? (PROC SQL could do
> a lot of what I want easily, but I stopped using SAS when I found mmerge
> could do most of what I used PROC SQL for.)

Dedicated database management packages mostly come with a lot of overhead, though I have no SAS experience. I would not let the size of your datasets stop you from using Stata. The memory would serve you well in any case -- buy all that your computer can take.
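Once the RAM is in place, remember to tell Stata to use it -- a sketch, where the 800m figure is illustrative (allocate comfortably more than the size of the file you load):

```stata
* With ~200M .dta files, set the allocation well above the file size
* before loading; the exact figure here is just an example.
set memory 800m
use bigdata, clear
memory                      // report how the allocation is being used
```

Loading a file close to the size of the allocation leaves no room for generated variables or sorting, so err on the generous side.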