
From: "Jitian Sheu" <jtsheu@mail.cgu.edu.tw>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: RE: st: stata 10 mp2 vs stata9
Date: Sun, 26 Aug 2007 02:00:55 +0800

The Stata 9 I used is just the single-processor Stata 9.0 SE.

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Vince Wiggins, StataCorp
Sent: Saturday, August 25, 2007 11:21 PM
To: statalist@hsphsun2.harvard.edu
Subject: Re: RE: st: stata 10 mp2 vs stata9

Jitian Sheu <jtsheu@mail.cgu.edu.tw> sends more information about his timings between Stata 9 and Stata 10/MP. His "program" is two -generate- statements with 22 string comparisons.

> . gen byte plavix_tmp=(order_code=="B022932100")
> . gen byte ticlo_tmp=(order_code=="A031596100" |
>                       order_code=="A033091100" |
>                       order_code=="A028034100" |
>                       order_code=="A033675100" |
>                       [...]
>                       order_code=="B018857100" )

And, he is running it on a 1.75 GB dataset.

Jitian has now run his timings using different amounts of memory allocated to Stata. His timings comparing 64-bit Stata 9 and 64-bit Stata 10/MP are:

                  Runtime
            ------------------
    Memory  Stata 9   Stata 10
    ------  -------   --------
     2.2     66.454      N/A
     2.4     69.063     57.969
     2.6     73.094     49.719
     2.8     77.844     54.735
     3.0     81.500     58.781

In the email that started this thread, Jitian said, "Stata 10 is actually 1.5 times slower that Stata 9". I have checked Jitian's latest email 3 times now, and I admit that perhaps I am a little slow on this Saturday morning, but it looks to me like all of the Stata 10 runtimes are faster than any of the Stata 9 runtimes. Stata 10/MP looks to be between 14% and 33% faster than Stata 9 on Jitian's program.

I am guessing that Jitian is comparing Stata 10/MP2 to Stata 9/SE, though he has never said whether his Stata 9 is MP or not. Others at StataCorp have been guessing that he is running MP for both Stata 9 and Stata 10. These timings suggest to me that Jitian is indeed running Stata 9/SE and Stata 10/MP2 and that the improved speed under Stata 10 is because Stata 10 is using both of his machine's processors/cores when executing the -generate- statements.
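The comparison in the table can be checked with a few lines of arithmetic. The sketch below is in Python only for convenience and is not part of the original exchange; the function names are illustrative. It computes two common measures for each memory setting where both versions were timed: the share of runtime saved, and the ratio of old to new runtime. The two measures give different percentages for the same pair of timings, so the exact figure quoted depends on which definition is used.

```python
# Runtimes copied from the table above, keyed by the memory setting.
# The 2.2 row is left out of the paired comparison because Stata 10
# has no timing at that setting.
timings = {
    2.4: (69.063, 57.969),
    2.6: (73.094, 49.719),
    2.8: (77.844, 54.735),
    3.0: (81.500, 58.781),
}

# All runs for each version, including the unpaired Stata 9 run at 2.2.
stata9_runs = [66.454, 69.063, 73.094, 77.844, 81.500]
stata10_runs = [57.969, 49.719, 54.735, 58.781]


def runtime_reduction(old, new):
    """Percent of the old runtime saved by the new run."""
    return 100.0 * (old - new) / old


def speedup(old, new):
    """Ratio of old runtime to new runtime (values above 1 mean faster)."""
    return old / new


for memory, (stata9, stata10) in timings.items():
    print(
        f"memory {memory:.1f}: "
        f"runtime reduced by {runtime_reduction(stata9, stata10):.1f}%, "
        f"speed ratio {speedup(stata9, stata10):.2f}"
    )

# Checks the claim that every Stata 10 run beats every Stata 9 run.
print("slowest Stata 10 run faster than fastest Stata 9 run:",
      max(stata10_runs) < min(stata9_runs))
```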
Jitian also wonders why Stata 10 needs more memory than Stata 9 for the same problem. Recall that Stata 10 has a new file format, and that is largely to support the new time-and-date format variables. This new storage format requires slightly more space to store Jitian's data in memory.

As Jitian notes, the runtimes depend on the amount of memory allocated to Stata. This does not surprise me as much as it does Jitian. Jitian's program spends most of its time running through the dataset, because his commands themselves do not take long to run. This means that the computer will be spending most of its time moving data from pretty fast normal memory to incredibly fast cache memory that is directly on the processor chip. As we change the amount of memory allocated to Stata, Jitian's data is distributed differently in memory and the caching process may be more or less efficient.

This also explains why Stata 10/MP is only 14% to 33% faster, when normally I would expect 50-100% faster. Because Jitian's computations are fairly fast, particularly the first -generate- statement, most of the computer's time is spent pulling data from regular memory to cache memory, rather than processing the commands.

If I'm wrong about Jitian using MP Stata 10 and SE under Stata 9, then the cache effect could still explain the speed differences. Though if this is the case, it is curious that Stata 10 is so much faster than Stata 9.

With modern computers, timings can be heavily influenced by these cache effects. Computationally intense processes, where heavy-duty computations are done on each record, are less affected by such caching, but Jitian's -generate- statements are not really heavy-duty.

Though it makes some timing comparisons difficult, you do not want chip makers to stop this trend toward large on-chip caches. Much of the increase in computer speed in recent years can be attributed to improved use of cache and not to higher processor clock speeds.
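The reasoning about why two cores give less than a twofold gain can be put into a rough back-of-the-envelope model. The sketch below is an illustration in Python, not something from the original message. It uses the standard Amdahl's law formula and makes a loud assumption: that the time spent moving data from memory to cache does not shrink at all when a second core is added, while the remaining computation splits evenly across cores. Real hardware is more complicated, and the function names are hypothetical.

```python
def predicted_speedup(parallel_fraction, cores):
    """Amdahl's law: speed ratio when only part of the work scales with cores.

    parallel_fraction is the share of single-core runtime that benefits from
    extra cores; the rest (assumed here to be memory-to-cache traffic) does not.
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def implied_parallel_fraction(observed_speedup, cores):
    """Invert Amdahl's law: the parallel share consistent with a speed ratio."""
    return (1.0 - 1.0 / observed_speedup) / (1.0 - 1.0 / cores)


# How the two-core speed ratio falls as less of the runtime is computation.
for fraction in (1.0, 0.75, 0.5, 0.25):
    ratio = predicted_speedup(fraction, cores=2)
    print(f"parallel share {fraction:.2f} -> speed ratio {ratio:.2f}")
```

Under this simplified model, a job that is all computation doubles in speed on two cores, and the gain shrinks steadily as the memory-bound share grows, which is the pattern described above for light commands run over a large dataset.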
-- Vince
   vwiggins@stata.com

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

