From: "Erik Ø. Sørensen" <email@example.com>
Subject: Re: st: Limits of Stata SE
Date: Wed, 30 Oct 2002 10:40:39 -0500
> Of course, I only have 256M of memory, but before I invest in more
> memory, I wanted to see if experienced folks thought Stata would be a
> good tool for such large datasets, assuming my computer had more memory.

I use Stata on a number of datasets this size (and some a bit larger) and am mostly very happy with it. Sometimes, for very costly operations, I have to split the datasets and iterate over the separate chunks, but it works fine. I have not used the Special Edition, but when a dataset is very large in number of observations, memory is the main limitation -- with millions of observations, you really do not want to try thousands of variables...
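In case it helps, a minimal sketch of what I mean by splitting into chunks -- the filename, the selection criterion, and the chunk size are all made up, so adjust to your data:

```stata
* Hypothetical sketch: pull the records that meet a criterion out of
* a large file in observation-range chunks, then stack the results.
* bigdata.dta, year >= 1990, and the 1,000,000-row chunks are examples.
forvalues i = 1/5 {
    local first = (`i' - 1) * 1000000 + 1
    local last  = `i' * 1000000
    use in `first'/`last' using bigdata, clear
    keep if year >= 1990        // the selection criterion goes here
    save chunk`i', replace
}
* combine the selected chunks into one file
use chunk1, clear
forvalues i = 2/5 {
    append using chunk`i'
}
save selected, replace
```

Only the selected records ever have to fit in memory at once, which is what makes this workable on a modest machine.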
> Most of my manipulations are pretty simple, mostly just selecting
> records meeting certain criteria and saving them to a separate file. The
> most challenging task, I suspect, will be using Jeroen Weesie's
> wonderful mmerge program to do some matching across the files.

I have no experience with mmerge, but I have done a lot of merging and matching -- as I said, sometimes splitting into chunks first. But thinking through how you program difficult merge operations is necessary with any system.
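For what it is worth, the general shape of a match across two files looks something like this -- person_id and the filenames are invented, and since I have not used mmerge myself, check -help mmerge- for the exact options:

```stata
* Hypothetical sketch of matching across files with -mmerge-.
* person_id, persons.dta, and hh.dta are made-up names.
use persons, clear
mmerge person_id using hh, type(n:1)
tab _merge                  // always inspect how many records matched
```

Whatever tool you use, tabulating the match result before going further is the step that catches most mistakes.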
> If I added sufficient memory (say, increasing my RAM to 1 gig, since
> the datasets take up about 200M each), am I likely to find Stata
> satisfactory for this task? My computer has a fast Pentium 3, to the
> degree that matters. Or should I look towards a dedicated database
> management package, like MySQL or, Heaven forbid, SAS? (PROC SQL could do
> a lot of what I want easily, but I stopped using SAS when I found mmerge
> could do most of what I used PROC SQL for.)

Dedicated database management packages mostly come with a lot of overhead, though I have no SAS experience. I would not let the size of your datasets stop you from using Stata. The memory would serve you well in any case -- buy all that your computer can take.
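Once the RAM is in place, remember to tell Stata to use it -- a sketch, where the 800m figure is illustrative (allocate comfortably more than the size of the file you load):

```stata
* With ~200M .dta files, set the allocation well above the file size
* before loading; the exact figure here is just an example.
set memory 800m
use bigdata, clear
memory                      // report how the allocation is being used
```

Loading a file close to the size of the allocation leaves no room for generated variables or sorting, so err on the generous side.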