Dear all,
I recently upgraded from STATA 7 to STATA 9. When I reran some old programs
and experimented with STATA 9, I found that STATA 9 is sometimes much slower
than STATA 7.
1. I have a dataset with about 700,000 observations and 4 variables. When I
type -list-, STATA 9 takes several seconds before it starts listing, while
STATA 7 begins listing instantly.
2. When I run the program at the end of this message on the above dataset in
STATA 9 and STATA 7, STATA 9 is always much slower. The dataset has about
700,000 observations. There is a categorical variable called 'group', which
takes consecutive integer values from 1 to 6250. Within each group there are
80-127 observations (different groups may have different numbers of
observations). For each group, I need to run a regression and record the
estimated coefficients. I use a loop to do the job. In the loop, I avoided
using -if group==`i'- because, in my experience with large datasets in
STATA 7, -if- takes more time than -in- to identify the desired observations.
Basically, I first determine the first and last observation of each group and
then run the regression in the loop using an -in- range.
I ran some experiments. With 1000 groups kept, STATA 7 took 17 seconds to
finish while STATA 9 took 54 seconds. With 3000 groups, STATA 7 took 144
seconds while STATA 9 took 471 seconds. With all 6250 groups, STATA 7 took
about 18 minutes, while STATA 9 took about 110 minutes. All the experiments
were run on the same computer with no other programs running. The results
don't make sense to me: Version 9 shouldn't be this much slower. It seems
that I need to optimize my program for STATA 9. Any thoughts or suggestions?
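* Pass 1: count the observations in each group
set more off
set mem 100m
use ./temp3, clear
sort group
by group: gen obsnum = _N
by group: keep if _n == 1
keep group obsnum
sum group
local max = r(max)
* Store each group's size in locals n1, n2, ...
* (relies on group taking every integer value from 1 to the maximum)
forval i = 1/`max' {
    local n`i' = obsnum[`i']
}
* Pass 2: one regression per group, selecting its block of rows with -in-
use ./temp3, clear
sort group
tempname result1
postfile `result1' id alpha beta using ./rep_beta_anndate, replace
* base holds the number of observations already processed
local base = 0
forval i = 1/`max' {
    local first = `base' + 1
    local last = `base' + `n`i''
    quietly regress ret vwretd in `first'/`last'
    * save the group id, intercept, and slope
    post `result1' (`i') (_b[_cons]) (_b[vwretd])
    local base = `base' + `n`i''
}
postclose `result1'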