Statalist The Stata Listserver


Re: st: Large dataset . Slow Stata

From   Tung Le
Subject   Re: st: Large dataset . Slow Stata
Date   Thu, 10 May 2007 02:04:27 -0700 (PDT)

Tobias, I used to work with a dataset twice as large as yours, and it still ran well on my Intercooled Stata 8.0 (around 30 seconds to open the data, and simple commands such as generate or replace ran immediately). That is why I think the problem lies in your RAM. I suggest that at the beginning of your session you set the memory to at least 500m; that should be more than enough for any operation. Hope it helps. Cheers.
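
A minimal sketch of that suggestion, assuming Stata 8 to 11 (where memory is allocated manually) and a hypothetical file name mydata.dta. On a laptop with 512 MB of RAM, an allocation this large may push the operating system into virtual memory, so a smaller value might work better in practice.

```stata
* Sketch only: mydata.dta is a hypothetical file name.
* -set memory- requires that no data be in memory.
clear
set memory 500m

* Load the data, then report how the allocated memory is used.
use mydata, clear
memory
```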

----- Original Message ----
From: Svend Juul <SJ@SOCI.AU.DK>
Sent: Thursday, May 10, 2007 9:22:11 AM
Subject: Re: st: Large dataset . Slow Stata

Tobias Pfaff wrote:

I am using Intercooled Stata 9.2 on a laptop with an AMD 1.79 GHz
and 512 MB ram. So far, Stata worked well with all datasets.

Now, we are analyzing a larger dataset with 70 variables and 180,000
observations. The dta-file has 224 MB. It takes two minutes just to open
the file, not to mention the processing time of simple operations like
-drop- or -replace-. Is that normal? I have tried -compress-, which does
not have any major impact on the file size.

What is a PROFESSIONAL WAY to handle such a dataset?

1.) Upgrade my computer equipment?
2.) Split up my dataset (which would be a big nuisance for the analysis,


Look at -help memory- if you haven't already done that.

A rough calculation (70 variables x 180,000 observations x 4 bytes)
tells that if your 70 variables are floats, the size of the 
data file should be approximately 50 MB. Apparently your dataset
includes some string variables. Dropping or converting them 
(see -help encode-) might be useful.
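
The size estimate and the conversion step could be sketched as follows. The variable names strvar and numvar are hypothetical, and -encode- is only a sensible choice for string variables with a limited number of distinct values.

```stata
* Expected size in MB if all 70 variables were 4-byte floats
display 70 * 180000 * 4 / 1e6

* List the string variables and check the overall dataset size
ds, has(type string)
describe, short

* Convert one string variable to a labelled numeric variable
* (strvar and numvar are hypothetical names)
encode strvar, generate(numvar)
drop strvar

* Store the remaining variables in the smallest type that fits
compress
```

The first line displays 50.4, which matches the approximate 50 MB figure above.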

Hope this helps


Svend Juul
Institut for Folkesundhed, Afdeling for Epidemiologi
(Institute of Public Health, Department of Epidemiology)
Vennelyst Boulevard 6
DK-8000  Aarhus C, Denmark
Phone: +45 8942 6090
Home:  +45 8693 7796

