

From: "Tiago V. Pereira" <tiago.pereira@mbe.bio.br>
To: statalist@hsphsun2.harvard.edu
Subject: st: How to perfom very simple manipulations in large data sets more efficiently
Date: Fri, 12 Aug 2011 11:43:23 -0300 (BRT)

Dear Statalisters,

I have to perform some extremely simple tasks, but I am struggling with the low efficiency of my naive implementations. Perhaps you have smarter ideas.

Here is an example. Suppose I have two variables, X and Y. I need to get the value of Y that is associated with the smallest value of X.

What I usually do is:

(1) simple approach 1

/* ------ start -------- */
sum X, meanonly
keep if X==r(min)
local my_value = Y[1]
/* ------ end -------- */

(2) simple approach 2

/* ------ start -------- */
sort X
local my_value = Y[1]
/* ------ end -------- */

These approaches are simple and work very well for small data sets. Now, however, I have to repeat the procedure 10k times on data sets that range from 500k to 1000k observations, so both approaches become very slow.

If you have any tips, I will be very grateful.

All the best,

Tiago
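For reference, one way to avoid both the sort and the loss of observations is to locate the observation number at which X attains its minimum and then subscript Y with it. The following is only a minimal sketch, not a method taken from this thread: the demo data and the helper variable `obsno` are hypothetical, and it assumes X and Y are numeric variables already in memory.

```stata
* Hypothetical demo data; in practice X and Y are already in memory
clear
set obs 1000
set seed 12345
generate double X = runiform()
generate double Y = rnormal()

* One pass over X to find its minimum (no sorting, nothing dropped)
summarize X, meanonly

* Record the observation number wherever X equals that minimum;
* r(min) from the summarize above is still available here
generate long obsno = _n if X == r(min)

* Take the smallest such observation number, so ties on X
* resolve to the first occurrence in the current order
summarize obsno, meanonly
local my_value = Y[r(min)]
drop obsno

display "Y at the smallest X: `my_value'"
```

Each step is a single linear pass through the data and leaves the data set intact, so it can be repeated inside a loop without reloading. When X has ties at its minimum, this picks the first tied observation, whereas `sort X` orders tied observations arbitrarily.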

**Follow-Ups**:

- **st: RE: How to perfom very simple manipulations in large data sets more efficiently** *From:* Nick Cox <n.j.cox@durham.ac.uk>
- **Re: st: How to perfom very simple manipulations in large data sets more efficiently** *From:* Stas Kolenikov <skolenik@gmail.com>
