The Stata listserver

Re: st: Speed issues with Stata 8


From   Fred Wolfe <[email protected]>
To   [email protected]
Subject   Re: st: Speed issues with Stata 8
Date   Thu, 16 Jan 2003 09:41:16 -0600

Thanks, Bill, so much for the very helpful explanation.

Fred

At 09:12 AM 1/16/2003 -0600, you wrote:
Fred Wolfe <[email protected]> asked about Stata 8's speed in
two contexts:

    1.  -use- and -merge-

    2.  -graph-

In this posting, I want to address (1).


Is Stata 8 slower using datasets?
---------------------------------

Stata 8 saves and uses datasets just as fast as Stata 7 does.  Fred, however,
observed that Stata 8 appears to be 10 times slower!  The slowdown comes from
reading datasets stored in Stata 7 format.  Using Stata 8, Fred needs to -use-
each of his old Stata 7 datasets and then -save- it again:

        . use <whatever>
        . save, replace

That will convert Fred's datasets into Stata 8 format, and thereafter, the
quickness Fred expects will return.


Why resaving datasets speeds up -use-
-------------------------------------

Stata 8 allows 26 new missing-value codes with the result that, internally,
Stata stores missing values differently.  When you -use- (or -merge- or
-append-) an old-format dataset, Stata not only loads the dataset, Stata
converts it as well.
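
To illustrate the new codes, here is a minimal sketch, assuming Stata 8 or
later.  The variable name x and the three-observation dataset are made up for
illustration.  The 26 extended codes are .a through .z, and they sort above
the system missing value (.), so a comparison against . catches all of them.

```stata
* Hypothetical three-observation dataset; x is an illustrative name
clear
set obs 3
generate x = _n

* Store two of the extended missing-value codes
replace x = .a in 2
replace x = .z in 3
list

* Counts every kind of missing value, extended codes included
count if x >= .

* Counts only observations holding the .a code
count if x == .a
```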

As an example, I created a 130MB dataset containing 200,000 observations
on 200 variables using Stata 7.  Using Stata 7,

        time to -save-              0.95 seconds
        time to -use-               1.32 seconds

Then I fired up Stata 8 and used this Stata-7 format dataset:

        time to -use-               8.92 seconds

Still in Stata 8, I resaved the data and tried -use- again:

        time to -save, replace-     0.95 seconds
        time to -use-               1.25 seconds

In this example, the time to convert is substantial, being 8.92 - 1.25 = 7.67
seconds.  That same overhead will appear in -merge- and -append- if I
leave the dataset in Stata-7 format.
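
A comparison like the one above could be reproduced with a sketch along these
lines.  It assumes a Stata-7 format dataset saved as bigdata.dta in the current
directory (a hypothetical file name) and uses -set rmsg on-, which makes Stata
report the elapsed time after each command.  Timings will differ by machine.

```stata
* bigdata.dta is a hypothetical Stata-7 format dataset
set rmsg on

* First -use-: load time plus the on-the-fly conversion
use bigdata, clear

* Rewrite the file in the current format
save bigdata, replace

* Second -use-: load time only, no conversion step
use bigdata, clear

set rmsg off
```

The difference between the two -use- timings is the conversion overhead.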

In smaller datasets, the conversion time is hardly noticeable.

It is convenient that Stata can work with old datasets without you needing to
convert them into modern format, but understand that Stata is converting your
datasets on the fly each and every time you work with them.  With large
datasets, I recommend converting the datasets only once.
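
For a whole directory of old datasets, the one-time conversion could be
scripted.  The following is only a sketch under stated assumptions: the
datasets sit in the current working directory, and your Stata provides the
-dir- extended macro function (it may not be available in every release, so
check -help extended_fcn- first).  The loop overwrites each file in place, so
run it on a copy if Stata 7 still needs to read the originals.

```stata
* Collect the names of all .dta files in the current directory
local files : dir "." files "*.dta"

foreach f of local files {
    * -use- converts the old format in memory; -save- writes the new format
    use "`f'", clear
    save "`f'", replace
}
```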

-- Bill
[email protected]
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
---------------------------------
Fred Wolfe ([email protected])
National Data Bank for Rheumatic Diseases
Arthritis Research Center Foundation
Wichita, KS USA



