
Re: st: Speed problems using Stata 8.2


From   [email protected] (Alan Riley)
To   [email protected]
Subject   Re: st: Speed problems using Stata 8.2
Date   Fri, 23 Jul 2004 10:42:28 -0500

Lars Eric Kroll ([email protected]) asked about Stata's performance:
> We're going to increase our usage of stata for data analysis at our
> institute.  Doing this we encountered some speed or performance
> problems with stata.  It would be nice if someone is able to solve
> them because FAQ stata.com isn't ;-)
> 
> The main question: What is an optimal but non-workstation hardware/
> system configuration under Win NT4 SP6 and Win XP Prof. Is stata or
> the OP-system for stata tuneable (except the memory commands in
> stata)?

Lars then listed some examples of the commands being used at his site
along with some timing results for running those commands on several
different OS and processor combinations.  To his surprise, there
was not much of a difference in the timings between the different
machines.

This is as surprising to me as it is to Lars, particularly since one
of the machines Lars listed had a processor clock speed less than
one third that of the others.

At this point I am not even sure what to guess.  We really need to
see Lars' data and do-files so that we can try to reproduce his
results.  The only thing I can think of that would take nearly
the same time on a 0.8 GHz P3 and a 3.0 GHz P4 would be something
that was incredibly dependent on hard disk I/O.

Something else to think about is that if Stata is producing a lot of
output in the Results window, a poor video card (or routing the output
over a network via Windows Terminal Server or VNC or something
similar) can really bog Stata down.  If one computer has a higher
video color bit depth than another, or if Stata's Results window is
sized much larger on one computer, that computer has to do much
more work scrolling Stata's output.  When running speed
tests on different computers, any differences like this should
be eliminated if possible.  Also, running the tests -quietly- in
Stata will give you a truer picture of how fast the processor
can handle the calculations Stata is asking it to perform.

Another possible performance penalty could arise if Stata's
official ado-files are installed on a network drive rather than on
the local hard disk; reading them over the network could slow
Stata down.

I would like Lars to send us his data (if possible) along with
the do-files he used to run his sample cases.  We have a wide
variety of hardware and operating system combinations available
here at Stata on which we could run timings.  Also, we could
take a look through his code to see whether anything appears
inefficient or could be recoded to run faster.


I do have a few comments about -set memory-, processor speed,
and the operating system.  Lars posted the following table:

  #  PC (CPU, clock, RAM)        Stata 8.2   Windows
                                 set mem     (versions up to date)
  =================================================================
  1  P3    0.8 GHz,  265 RAM     120m        NT 4
  2  P4    2.8 GHz,  512 RAM     120m        NT 4
  3  P4HT  3.0 GHz,  512 RAM     120m        XP Prof.
  4  P4HT  3.0 GHz,  512 RAM     340m        XP Prof.
  5  P4HT  3.0 GHz,  512 RAM     800m        XP Prof.


Assuming Lars ran the identical do-files using the identical data
on each machine, then the original -set mem 120m- he used is all
that is needed for any of the machines.  In fact, allocating more
memory to Stata than is needed can cause the operating system to
start swapping memory to the hard drive needlessly, greatly slowing
down Stata.  This would have been most likely for the -set mem 800m-
listed on the last machine above.

I also see that Lars said that they will work with some datasets
around 1 GB in size.  None of the machines above is suitable for
this; they need more memory.  To work with a dataset that is 1 GB
in size, I would like to see a machine with around 1.4 or 1.5 GB
of RAM installed.
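
As a rough illustration of the memory arithmetic above, here is a small Python sketch. It is not part of Stata and not how Stata itself decides anything; the function names are hypothetical, the RAM figures in the table are assumed to be in MB, and the 1.4 to 1.5 headroom factor is simply taken from the 1 GB example in this message.

```python
# Back-of-the-envelope memory check, using only figures from this message.
# Assumptions: RAM values in the table are megabytes; the headroom factors
# (1.4 and 1.5) are read off the "1 GB dataset -> 1.4 or 1.5 GB RAM" remark.

def exceeds_physical_ram(set_mem_mb, physical_ram_mb):
    """True if the memory given to Stata is larger than installed RAM,
    which invites the operating system to swap to disk."""
    return set_mem_mb > physical_ram_mb


def suggested_ram_gb(dataset_gb, low_factor=1.4, high_factor=1.5):
    """Return a (low, high) range of installed RAM for a dataset size."""
    return (dataset_gb * low_factor, dataset_gb * high_factor)


# (row number, RAM as listed, set mem) copied from the table in this message
MACHINES = [
    (1, 265, 120),
    (2, 512, 120),
    (3, 512, 120),
    (4, 512, 340),
    (5, 512, 800),
]

for row, ram_mb, set_mem_mb in MACHINES:
    flag = "exceeds RAM" if exceeds_physical_ram(set_mem_mb, ram_mb) else "ok"
    print(row, ram_mb, set_mem_mb, flag)

low, high = suggested_ram_gb(1.0)
print("1 GB dataset: roughly", low, "to", high, "GB of RAM")
```

Only the last row, with -set mem 800m- on 512 of RAM, is flagged by this check; it says nothing about how much memory the operating system and other programs need.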

Stata does not take advantage of an HT processor, because Stata
is a single-threaded application.  If you wish to run other tasks at
the same time as Stata, there will be some performance benefit to
running on a P4 with HT enabled over a 'regular' P4 since the machine
will essentially be working as though it has two processors.  However,
there will be some performance penalty for any single-threaded
application on an HT processor, as the full resources of that
processor are not really available.

With all else equal, Stata's speed on two machines should usually vary
by the same ratio as the processor speed on the two machines.  Thus,
I would expect the middle three machines listed above to all perform
roughly the same, with machines 3 and 4 perhaps suffering a little bit
of a performance hit due to HT being enabled and applications therefore
not being able to exploit the full clock speed of the processor on each
of its 'halves'.  I would expect the first machine listed to be quite
a bit slower, and I would expect that the last machine listed might
have some performance problems because more memory was allocated to
Stata than the machine has physical RAM.
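
To make the "same ratio as the processor speed" expectation concrete, here is a minimal Python sketch of that arithmetic. The function names are hypothetical, and the model deliberately ignores disk I/O, HT effects, operating system overhead, and swapping, all of which are discussed in this message as reasons real timings may differ.

```python
# Naive clock-speed scaling: with all else equal, runtime is assumed to be
# inversely proportional to processor clock speed. This is only the
# rule of thumb stated in the message, not a measurement.

def expected_speedup(slow_ghz, fast_ghz):
    """How many times faster the faster machine should be."""
    if slow_ghz <= 0 or fast_ghz <= 0:
        raise ValueError("clock speeds must be positive")
    return fast_ghz / slow_ghz


def expected_runtime(baseline_seconds, baseline_ghz, other_ghz):
    """Scale a runtime measured on one machine to another clock speed."""
    return baseline_seconds / expected_speedup(baseline_ghz, other_ghz)


# Clock speeds taken from the table in this message.
print(expected_speedup(0.8, 2.8))          # P3 0.8 GHz vs P4 2.8 GHz
print(expected_speedup(0.8, 3.0))          # P3 0.8 GHz vs P4HT 3.0 GHz
# Hypothetical 100-second job on the 0.8 GHz machine, rescaled to 3.0 GHz.
print(expected_runtime(100.0, 0.8, 3.0))
```

Under this model the 0.8 GHz machine should take well over three times as long as the others, which is why nearly identical timings point to something other than the processor, such as disk I/O or screen output.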

Windows XP also has some additional overhead in terms of memory and
CPU cycles compared to NT 4 and Windows 2000, so it can take away some
of the performance benefit of a faster processor.

I am interested in seeing Lars' data and do-files and finding out
the explanation of why Stata took approximately the same time to
run the tasks he gave it on the various OS/processor combinations
he had.  We will report back to the list any interesting observations
we have.


--Alan
([email protected])
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/