



Re: st: where is StataCorp C code located? all in a single executable as compiled binary?


From   László Sándor <[email protected]>
To   [email protected]
Subject   Re: st: where is StataCorp C code located? all in a single executable as compiled binary?
Date   Fri, 23 Aug 2013 13:03:10 -0400

Thanks, Austin,
These were separate runs for the -bys- loops, all loading the same
unsorted data. So while I am not sure why the speed varies so much
(the hardware is said to be dedicated entirely to this task), at
least I am very much impressed that it could ever get under 5 minutes:
8-10 times faster than -tab-, twice as fast as -collapse-, etc. That
said, as you can see in the other thread, the -statsby- run that
would make the results usable is immediately 5 times slower again…

Yes, maybe I should just try Mata. But it is hard to imagine that
dedicated, single-purpose but automatically compiled code could beat
the optimized C code of -bys- (or at least -statsby-) built into Stata.
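For what it is worth, a minimal Mata sketch of the single-pass alternative (hypothetical variable names `bin`, integer-coded 1..k, and `y`; not anyone's actual code from this thread) would compute all group means in one scan, with no sort at all:

```stata
* hypothetical sketch: group means in Mata in a single pass, no sorting
mata:
    bin = st_data(., "bin")          // group identifier, integer-coded 1..k
    y   = st_data(., "y")            // variable to average
    k   = max(bin)
    sums   = J(k, 1, 0)
    counts = J(k, 1, 0)
    for (i = 1; i <= rows(y); i++) {
        sums[bin[i]]   = sums[bin[i]] + y[i]
        counts[bin[i]] = counts[bin[i]] + 1
    }
    means = sums :/ counts           // k x 1 vector of group means
end
```

Whether a one-pass accumulator like this beats Stata's compiled -by- machinery is exactly the open question here; it avoids the O(N log N) sort but runs interpreted Mata loop iterations instead.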

On Thu, Aug 22, 2013 at 5:58 PM, Austin Nichols <[email protected]> wrote:
> László Sándor <[email protected]> :
> I am guessing your -sort- takes 16 minutes, and each -by- calculation
> takes 4 or 5 minutes. The first -bysort- sorts the data; subsequent
> calls to -bysort- do not need to re-sort the data. Have you tried
> using Mata?
>
> On Thu, Aug 22, 2013 at 6:43 AM, László Sándor <[email protected]> wrote:
>> For those out there who care:
>>
>> I wonder why this thing is not more stable. I am confident that now I
>> am using all 64 cores on a node of a cluster with Stata/MP 13, with
>> plenty of RAM. I generate the 8 byte variables taking 20 values each,
>> with maxlong() observations. I use the same random sorting before
>> running any of these methods, and I try the sequence twice independently.
>>
>> Now -tab, sum()- got slower: it took roughly 36 minutes, all three
>> times I tried.
>> -collapse, fast- took 20-23 minutes.
>> The if loops took around 90-100 minutes.
>> The -bys- loop took only 20 minutes once, then LESS THAN 4.5 minutes
>> twice (with unsorted data?!).
>>
>> Of course, now the question is, why doesn't -tab- use the same optimizations…
>>
>> In any case, perhaps this was useful.
>>
>> Laszlo
>>
>> On Tue, Aug 20, 2013 at 4:19 PM, László Sándor <[email protected]> wrote:
>>> So, I reran the test on 8 cores, with Stata/MP 13, with 32 GB RAM.
>>>
>>> I made the following changes:
>>> 1. I maxed out the number of observations. (see -h limits- and -h maxlong-)
>>> 2. Made ten byte variables taking 20 integer values; this takes up 25
>>> GB out of the 32, close to the StataCorp recommendation of leaving
>>> 50% extra. But I did not check whether virtual memory is touched;
>>> maybe I can scale the dataset down a bit.
>>> 3. So I am taking 20 bins now, in case -tabulate, sum- and loops of
>>> -sum if, meansonly- scale differently.
>>> 4. I take only oneway tabs, as that's what I need, testing twoway was a mistake.
>>> 5. I also try a -bys bins:- "looping".
>>> +1. As I mentioned, I corrected Eric's code, which did not loop over
>>> all the values that were "tabbed over". Now the two are comparable.
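The setup described above might be sketched roughly as follows (scaled down for illustration; the variable names, the scaled-down size, and the exact commands are assumptions, not the original do-file):

```stata
* hedged sketch of the benchmark setup described above (scaled down)
clear all
set seed 12345
set obs 1000000                      // the actual runs used maxlong() observations
forvalues i = 1/10 {
    generate byte x`i' = ceil(20*runiform())   // byte variables with 20 integer values
}
timer clear
timer on 1
tabulate x1, summarize(x2) nofreq noobs nolabel nostandard   // oneway -tab, sum-
timer off 1
timer on 2
preserve
collapse (mean) x2, by(x1) fast      // -collapse, fast-
restore
timer off 2
timer on 3
bysort x1: summarize x2, meanonly    // the -bys bins:- "looping"
timer off 3
timer list
```

The preserve/restore around -collapse- keeps the dataset intact between timed runs, which the real benchmark would also need (or separate fresh loads, as the timings above suggest were used).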
>>>
>>> In this setup,
>>> -- -tabulate, sum nof noobs nol nost- completes in only 1516.36
>>> seconds, or ~25 minutes.
>>> -- the simple frequency tab takes only 583.51 s, but again, this is
>>> not part of the comparison.
>>> -- -collapse, fast- took 4025.64 seconds, much slower than -tab, sum-,
>>> which is very strange. (I am pretty sure I have exclusive use of this
>>> compute node; no other process is running or preempting mine.)
>>> -- the if-loops took 3967 s, shockingly comparable to -collapse, fast-,
>>> but still much slower than (now oneway) -tab, sum-.
>>> -- -bys bins: sum, meanonly- took 3205 s.
>>>
>>> So -tab, sum- is unbeatable on big data for oneway tabs with a
>>> moderate number of bins. Or others can run other tests.
>>>
>>> So I stick to parsing the log of -tab, sum-.
>>>
>>> Thanks for all your thoughts,
>>>
>>> Laszlo
>>>
>>> On Tue, Aug 20, 2013 at 5:08 AM, László Sándor <[email protected]> wrote:
>>>> Thanks, Maarten.
>>>>
>>>> My understanding of byable commands was that they loop over -if-
>>>> conditions anyway, though -in- conditions are supposed to be less
>>>> wasteful, which would explain why the prefix requires sorted data.
>>>>
>>>> Trust me, this code is heavily used on big data; if each run can save
>>>> us minutes, it is still worth it. And my current tests, maxing out
>>>> the code in this thread at -maxlong()- observations (the limit) and
>>>> thus 20 GB of data, give -collapse- a 20-minute lead over -tab, sum-.
>>>> However, the key comparison is with the loops here, and I did not
>>>> catch that the test was biased in their favor: they did not loop over
>>>> all observations. I am rerunning those tests now.
>>>>
>>>> On Tue, Aug 20, 2013 at 4:21 AM, Maarten Buis <[email protected]> wrote:
>>>>> On Mon, Aug 19, 2013 at 7:30 PM, László Sándor wrote:
>>>>>> The other option seemed to be to try to keep track of the levels of
>>>>>> "bins", and just forval loop over the values, if-ing in a bin at a
>>>>>> time to quickly grab the means. This was surprisingly fast, and does
>>>>>> not seem to be any slower without a sort beforehand. Again, I am not
>>>>>> sure the efficiency of -bys- over looping with -if-s is worth
>>>>>> the cost of the initial sorting.
>>>>>
>>>>> I think you are mixing up advice here: -by: <something>- is likely to
>>>>> be faster than a -forvalues- loop combined with -if- conditions. I
>>>>> don't think anyone suggested that you sort before that loop. The logic
>>>>> is that an -if- condition will each time, by necessity, have to go
>>>>> through all observations. The alternative would be a single sort with
>>>>> -in- conditions, which I guess is what is at the core of the speed of
>>>>> the -by- prefix. Depending on how many times you want to use -if-
>>>>> conditions, there will be a point where the combination of a single
>>>>> -sort- and many -in- conditions will be quicker than many -if-
>>>>> conditions. But I don't expect that -sort-ing will help if you choose
>>>>> the -forvalues- loop combined with -if- conditions.
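The contrast above can be sketched minimally (hypothetical variables `bin`, integer-coded 1-20, and `y`; the code is illustrative, not from the thread):

```stata
* repeated -if- conditions: every pass scans all _N observations,
* so 20 passes cost roughly 20 full scans of the data
forvalues v = 1/20 {
    summarize y if bin == `v', meanonly
}

* one O(N log N) sort, after which -by- visits each observation once
bysort bin: summarize y, meanonly
```

With 20 groups the -if- version does about 20N comparisons, while the sorted -by- version pays the sort once and then a single pass, which is the trade-off Maarten describes.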
>>>>>
>>>>> On a pragmatic level: how much time have you now spent trying to write
>>>>> this code, and how much time do you expect to save with it? Are you
>>>>> sure you won't end up with a net loss of time?
>>>>>
>>>>> -- Maarten
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/


