
From: Jeph Herrin <junk@spandrel.net>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: speed question: -collapse- vs -egen-
Date: Sat, 26 Apr 2008 11:08:42 -0400

Thanks to Stas, Sergei and Michael for some tips on speeding up things.

Sergei's suggestion of a plugin falls victim to what Kit points out is the limitation of plugins: I'm running 64-bit Stata on 64-bit XP, so his plugin won't help me. In fact, since I develop a lot of code on my 32-bit desktop before running on the 64-bit machine, it's hardly worth my while to write my own plugins, even as a former C programmer.

However, Kit has inspired me to try my hand at a Mata solution.

Thanks to all,
Jeph

Michael Blasnik wrote:

...

You can gain some speed in regular Stata code by not generating a separate variable just to count the number of non-missings:

bysort rep78: gen mean=sum(price)/sum(price<.)

by rep78: keep if _n==_N

On my machine, this reduces the time required for the corrected Stas code from 17.3 to 13.8 s.
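For readers who want to try the two lines above in isolation, here is a minimal self-contained sketch. It assumes the plain auto.dta that ships with Stata rather than the expanded copy used for the timings, and the variable name meanprice is chosen only for illustration. The trick relies on sum() being a running sum that treats missing values as zero, so the last observation in each group holds the group total divided by the count of nonmissing prices.

```stata
* Sketch only: plain auto.dta, not the expanded dataset timed in the thread
sysuse auto, clear
keep price rep78

* Assumption: cars with a missing repair record are dropped so that
* no "missing" group appears in the result
drop if missing(rep78)

* Running total of price divided by running count of nonmissing prices
bysort rep78: generate double meanprice = sum(price) / sum(price < .)

* The last row of each group carries the full-group mean
by rep78: keep if _n == _N

list rep78 meanprice, noobs
```

On the unexpanded data this should give the same group means as the table quoted further down the thread, since duplicating observations does not change a mean.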

Michael Blasnik

----- Original Message -----
From: "Sergiy Radyakin" <serjradyakin@gmail.com>
To: <statalist@hsphsun2.harvard.edu>
Sent: Friday, April 25, 2008 9:12 PM
Subject: Re: st: speed question: -collapse- vs -egen-

Hello All!

Jeph has asked about an efficient way of creating a dataset with means of one variable over the categories of another variable. He suggested two possible solutions and Stas added a third one. Below I report performance of each of these methods and compare it with the fourth: a plugin. I use an expanded version of auto.dta and tabulate mean {price} by different levels of {rep78}.

1. All methods resulted in the following table of results*

   meanprice   rep78
   4564.5      1
   5967.625    2
   6429.233    3
   6071.5      4
   5913        5

2. The timing is as follows (Stata SE, Windows Server 2003, 32-bit)

   1:  33.80 / 1 = 33.7960
   2:  31.22 / 1 = 31.2190
   3:  21.33 / 1 = 21.3280
   4:   5.58 / 1 =  5.5780

<snip>
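The test code itself was snipped from this message, so the following is only a rough sketch of how such a comparison could be set up with Stata's timer commands. The expansion factor, the assignment of timer numbers to methods, and the exact commands are assumptions for illustration, not the code that produced the timings above. The plugin method is omitted because its source is not shown in the thread.

```stata
* Sketch only: expansion factor and timed commands are assumptions
sysuse auto, clear
keep price rep78
drop if missing(rep78)
expand 10000

timer clear

* Timer 1: collapse
preserve
timer on 1
collapse (mean) meanprice = price, by(rep78)
timer off 1
restore

* Timer 2: egen, then keep one row per group
preserve
timer on 2
egen meanprice = mean(price), by(rep78)
bysort rep78: keep if _n == 1
timer off 2
restore

* Timer 3: running sums under by:
preserve
timer on 3
bysort rep78: generate double meanprice = sum(price) / sum(price < .)
by rep78: keep if _n == _N
timer off 3
restore

* Reports, for each timer, total seconds / number of runs = average
timer list
```

The preserve and restore calls sit outside the timed regions so that the cost of copying the data back is not charged to any method.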

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


