



Re: st: how to parallelize Mata (or steal the performance of built-in -tab, summarize-)


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: how to parallelize Mata (or steal the performance of built-in -tab, summarize-)
Date   Tue, 3 Apr 2012 10:01:48 +0100

Overnight I remembered -binsm-

SJ-6-1  gr26_1  . . . . . . . . . . . . . . . . . .  Software update for binsm
        (help binsm if installed) . . . . . . . . . . . . . . . . .  N. J. Cox
        Q1/06   SJ 6(1):151
        rewritten to support modern Stata graphics

STB-37  gr26  . . . . . . . . . . . Bin smoothing and summary on scatter plots
        (help binsm if installed) . . . . . . . . . . . . . . . . .  N. J. Cox
        5/97    pp.9--12; STB Reprints Vol 7, pp.59--63
        alternative to graph, twoway bands(); produces a scatterplot
        of yvar against xvar with one or more summaries of yvar for bins
        of xvar

and -twoway__histogram_gen-

SJ-5-2  gr0014  . . . . . . . Stata tip 20: Generating histogram bin variables
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . D. A. Harrison
        Q2/05   SJ 5(2):280--281                                 (no commands)
        tip illustrating the use of twoway__histogram_gen for
        creation of complex histograms and other graphs or tables

My strategic advice is this. You want a reduced dataset for graphing,
so -drop- aggressively. Once you have identified observations "to
use", go

keep if `touse'
drop `touse'

Once the mean is in the last observation of every block of
observations, -drop- all the others.
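The advice above can be sketched as follows. This is only a minimal illustration, not László's program: the auto dataset, the five quantile bins, and the names touse, bin, mean_x and mean_y are all assumptions made for the example, and weights are left out.

```stata
* Sketch under stated assumptions: auto data, unweighted, illustrative names
sysuse auto, clear

* Identify the observations "to use", then drop the rest straight away
generate byte touse = !missing(mpg, weight)
keep if touse
drop touse

* Hypothetical binning of the x variable into five quantile groups
xtile bin = weight, nquantiles(5)

* Running sums within each bin; the last observation holds the bin total
bysort bin: generate double mean_x = sum(weight)
by bin: generate double mean_y = sum(mpg)

* Divide by the bin size, so the last observation of each bin holds the mean
by bin: replace mean_x = mean_x / _N
by bin: replace mean_y = mean_y / _N

* Keep only the last observation of every bin and plot the reduced dataset
by bin: keep if _n == _N
scatter mean_y mean_x
```

Because the data are reduced in place, nothing is preserved or restored; wrap the whole thing in -preserve- only if the full dataset is needed afterwards.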


2012/4/3 László Sándor <sandorl@gmail.com>:
> Thanks for this, Nick.
>
> I found the (many and embarrassing) mistakes in my code; below is a
> neater version that actually does what it should, or so it seems.
>
> That said, it is still rarely faster than logging -tab, sum()- though
> with many millions of observations, running on many (>4) cores, it at
> least has a little advantage. (But both beat my bare bones Mata
> attempts.)
>
> I am still curious how secret StataCorp's secret sauce is here, as
> this kind of "collapsing" is commonplace for many descriptives (also
> bar graphs, line graphs, etc.). They are rightly proud if they managed
> to tweak -tabulate- to run this fast, but perhaps they could let us
> (and themselves) work towards making other, similar code run faster
> too. Though, of course, there must be a reason (general purpose etc.)
> why this is harder elsewhere.
>
> Thanks again,
>
> Laszlo
>
> tempvar wsum tag
>
> if ("`y2_var'"!="") local y2 y2
> else local y2 ""
>
> sort `x_q' `touse'
> by `x_q' `touse': g byte `tag' = _n == _N
> if ("`weight1'"!="") by `x_q' `touse': g `wsum' = sum(`weight1')
> else by `x_q' `touse': g `wsum' = _N
>
> foreach v in x y `y2' {
>        if ("`weight1'"!="") by `x_q' `touse': g ``v'_mean' = sum(``v'_r'*`weight1')
>        else by `x_q' `touse': g ``v'_mean' = sum(``v'_r')
>
>        quietly replace ``v'_mean' = cond(`tag' & `touse',``v'_mean'/`wsum',.)
> }
>
> On Mon, Apr 2, 2012 at 6:11 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>
>> I will look at it tomorrow.
>>
>> 2012/4/2 László Sándor <sandorl@gmail.com>:
>> > Nick,
>> >
>> > thanks, I did follow up on your post. Sadly, I could not easily get
>> > -by- working or, to be precise, use the variables that it
>> > generated. Below is an attempt; if I may take the liberty of asking
>> > you to parse it, I would be grateful for comments on getting it
>> > working. The indexing must be off. It tries to average two (x_r and
>> > y_r) or three (y2_r extra) variables. It generates values that are
>> > too large for some bins (i.e. from U[0,1] variables some averages
>> > exceed 20).
>> >
>> > I am happy if someone from StataCorp follows up too! :)
>> >
>> > Thanks,
>> >
>> > László
>> >
>> > tempvar wsum tag ones
>> > g byte `ones' = 1
>> >
>> >
>> > if ("`y2_var'"!="") local y2 y2
>> > else local y2 ""
>> >
>> >
>> > if ("`weight1'"!="") g `wsum' = sum(`weight1')  if `touse'
>> > else g `wsum' = sum(`ones')  if `touse'
>> >
>> >
>> > sort `x_q'
>> > by `x_q': g byte `tag' = _N if `touse'
>> >
>> > foreach v in x y `y2' {
>> > if "`weight1'"!=""{
>> > by `x_q': g ``v'_mean' = sum(``v'_r'*`weight1')  if `touse'
>> > by `x_q': replace ``v'_mean' = ``v'_mean'/`wsum' if `tag' & `touse'
>> > }
>> >
>> > else {
>> > by `x_q': g ``v'_mean' = sum(``v'_r') if `touse'
>> > by `x_q': replace ``v'_mean' = ``v'_mean'/`wsum' if `tag' & `touse'
>> > }
>> > }
>> >
>> >
>> > On Mon, Apr 2, 2012 at 3:36 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>> >>
>> >> We are back to the questions you asked a week ago. Mostly this is for
>> >> StataCorp. Otherwise please see again my answers at
>> >>
>> >> http://www.stata.com/statalist/archive/2012-03/msg01144.html
>> >>
>> >> I've had dramatic speed-ups with Mata -- my record is reducing
>> >> execution time from 5 days to 2 minutes, but that was partly because
>> >> my original code was so dumb -- but I've not tried anything like the
>> >> stuff you were using.
>> >>
>> >> -tabulate, summarize- is compiled C code. I think the nearest you can
>> >> get is by using -by:- as explained in the post just quoted.
>> >>
>> >> Nick
>> >>
>> >> 2012/4/2 László Sándor <sandorl@gmail.com>:
>> >> > Hi all,
>> >> >
>> >> > I had several questions recently on this list about compiling Mata
>> >> > code. I still could not generate the compile-time locals with
>> >> > loops, so I typed them out and compiled. I have now done my test
>> >> > runs, and the results are surprising. Let me ask you why:
>> >> >
>> >> > My basic problem was to do a fast "collapse" to make binned scatter
>> >> > plots. Collapse was unacceptably slow, probably because of the
>> >> > necessary preserve-restore cycles, or inefficient coding of collapse
>> >> > (for its general purpose).
>> >> >
>> >> > I already had a version that parsed a log of -tabulate, summarize-.
>> >> > Yes, it is as much of a hack as it sounds like. I was not expecting
>> >> > this to be fast, at least because of the file I/O and the parsing.
>> >> >
>> >> > Now I built a Mata function that "collapses" into new variables
>> >> > while leaving the data otherwise intact. For this I used Ben Jann's
>> >> > -mf_mm_collapse-, and compiled all the necessary functions myself in
>> >> > the ado file.
>> >> >
>> >> > And the test run with 100 million observations told me it was slower
>> >> > than the hack. Before I give up and claim the hack unbeatable, I have
>> >> > one suspicion. I had the test run on Stata 12 MP on a cluster, with
>> >> > 12 cores. Perhaps -tabulate- used all of them, and my code did not.
>> >> >
>> >> > Are there guidelines on how to speed up Mata in this situation (if it
>> >> > is not MP-aware to begin with)?
>> >> >
>> >> > Or, tentatively, can I ask for some guidance about the magic of
>> >> > -tabulate, summarize-? Is that magic accessible/reproducible without
>> >> > just logging its output?
>> >> >


