Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: where is StataCorp C code located? all in a single executable as compiled binary?

From	László Sándor <[email protected]>
To	[email protected]
Subject	Re: st: where is StataCorp C code located? all in a single executable as compiled binary?
Date	Tue, 20 Aug 2013 05:08:29 -0400

Thanks, Maarten.

My understanding of byable commands was that they loop over -if-
conditions anyway, though -in- conditions are supposed to be less
wasteful and would explain why the prefix requires sorted data.

Trust me, this code is heavily used on big data; if each run can save
us minutes, it is still worth it. My current tests, which max out
the code in this thread at -maxlong()- observations (the
limit) and thus 20 GB of data, give -collapse- a 20-minute lead
over -tab, sum-. However, the key comparison is with the loops here,
and I did not catch that the test was biased in their favor, as they
did not loop over all observations. I am rerunning those tests now.

On Tue, Aug 20, 2013 at 4:21 AM, Maarten Buis <[email protected]> wrote:
> On Mon, Aug 19, 2013 at 7:30 PM, László Sándor wrote:
>> The other option seemed to be to try to keep track of the levels of
>> "bins", and just forval loop over the values, if-ing in a bin at a
>> time to quickly grab the means. This was surprisingly fast, and does
>> not seem to be any slower without a sort beforehand. Again, I am not
>> sure any efficiency of -bys- looping over ifs is worth
>> the cost of the initial sorting.
>
> I think you are mixing up advice here: -by: <something>- is likely to
> be faster than a -forvalues- loop combined with -if- conditions. I
> don't think anyone suggested that you sort before that loop. The logic
> is that an -if- condition will each time by necessity have to go
> through all observations. The alternative would be a single sort with
> -in- conditions, which I guess is what is at the core of the speed of
> the -by- prefix. Depending on how many times you want to use -if-
> conditions, there will be a point where the combination of a single
> -sort- and many -in- conditions will be quicker than many -if-
> conditions. But I don't expect that -sort-ing will help if you choose
> the -forvalues- loop combined with -if- conditions.
>
> On a pragmatic level: how much time have you now spent trying to write
> this code, and how much time do you expect to save with that? Are you
> sure that you don't end up with a net loss of time?
>
> -- Maarten
>
> ---------------------------------
> Maarten L. Buis
> WZB
> Reichpietschufer 50
> 10785 Berlin
> Germany
>
> http://www.maartenbuis.nl
> ---------------------------------
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
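
The trade-off described in the quoted message (many -if- conditions, each scanning all observations, versus a single sort followed by -in- ranges) can be illustrated with a small language-neutral sketch. This is not Stata code and makes no claim about how the -by- prefix is actually implemented; the function names and the toy data are hypothetical, chosen only to show where the work goes in each approach.

```python
def means_by_if(bins, values, levels):
    """Mean of values per level, using one full pass over the data per level.

    Mimics a -forvalues- loop with an -if- condition: for k levels and
    n observations, roughly k * n comparisons are made.
    """
    out = {}
    for level in levels:
        total, count = 0.0, 0
        for b, v in zip(bins, values):  # every observation is tested each time
            if b == level:
                total += v
                count += 1
        if count:
            out[level] = total / count
    return out


def means_by_sorted_ranges(bins, values):
    """Mean of values per level, using one sort and one pass over contiguous blocks.

    Mimics a single -sort- followed by -in- ranges: after sorting, each
    level occupies one contiguous range, so each observation is visited once.
    """
    order = sorted(range(len(bins)), key=bins.__getitem__)  # the single sort
    out = {}
    n = len(order)
    start = 0
    while start < n:
        level = bins[order[start]]
        end = start
        total = 0.0
        # Walk to the end of the contiguous block for this level.
        while end < n and bins[order[end]] == level:
            total += values[order[end]]
            end += 1
        out[level] = total / (end - start)
        start = end
    return out


if __name__ == "__main__":
    # Hypothetical toy data: five observations in three bins.
    bins = [2, 1, 2, 3, 1]
    values = [1.0, 2.0, 3.0, 4.0, 6.0]
    print(means_by_if(bins, values, [1, 2, 3]))
    print(means_by_sorted_ranges(bins, values))
```

Both functions return the same means. The first pays for a full scan per level, while the second pays for the sort once and then reads each block directly. Which one is faster depends on the number of levels and on the cost of the sort, which is the point at issue in this thread.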

