



Re: st: faster xtiling


From   Maarten Buis <maartenlbuis@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: faster xtiling
Date   Fri, 7 Sep 2012 17:50:15 +0200

On Fri, Sep 7, 2012 at 5:04 PM, László Sándor wrote:
> I am trying to speed up -xtile- for Stata 11 and above for all
> platforms (for internal use) used with tens of millions of
> observations.
>
> I checked the source of -xtile-, and I am not sure I understand all
> its purpose. Most importantly, it does sort the data (a no-no with
> data the size of mine), even though the crucial step of _pctile does
> not need presorted data.

The sorting only happens if you ask for more than 1,001 quantiles,
which suggests to me that some limitation in _pctile makes it
necessary. If it were just laziness or sloppiness, then it would be
extremely unlikely that the code would have been written that way.
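
For the case at or below that threshold, here is a minimal sketch of how
quantile categories could be assigned without sorting the data. It is not
the actual -xtile- source. It assumes the shipped auto dataset, and the
names nq, last, c1, c2, ... and qbin are hypothetical. The idea is to let
_pctile return the cutpoints in r(r1), r(r2), ... and then bin each
observation by comparing it against those cutpoints.

```stata
* Sketch only: bin price into quartiles without sorting the dataset.
sysuse auto, clear
local nq 4                          // hypothetical number of quantiles
local last = `nq' - 1               // number of cutpoints
_pctile price, nquantiles(`nq')     // cutpoints returned in r(r1)..r(r3)

* Copy the cutpoints into scalars before running further commands.
forvalues i = 1/`last' {
    scalar c`i' = r(r`i')
}

* Start every nonmissing observation in bin 1, then move observations
* that lie above each successive cutpoint into the next bin.
generate int qbin = 1 if !missing(price)
forvalues i = 1/`last' {
    quietly replace qbin = `i' + 1 if price > scalar(c`i') & !missing(price)
}
```

This makes one pass over the data per cutpoint, so with tens of millions
of observations and many quantiles it may still be slow, but the
observation order is never changed.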

> And while I am at it, I am also happy to hear comments about the
> prospects of using Mata for any of this. _pctile is built-in,
> optimized, tailored, tweaked, polished C code, so there is little hope
> that Mata might improve the crucial steps, right?

As to the properties of -_pctile-, only StataCorp can say anything
about that, as we cannot see its source code any more than you can.

-- Maarten

---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany

http://www.maartenbuis.nl
---------------------------------
