Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: where is StataCorp C code located? all in a single executable as compiled binary?


From   Nick Cox <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: st: where is StataCorp C code located? all in a single executable as compiled binary?
Date   Mon, 19 Aug 2013 16:55:52 +0100

-levelsof- is not an equivalent to -bysort:- in any sense.

What -levelsof- does best in my view (some declaration of interest
implicit here) is

1. Show compactly the distinct levels of some variable in the data.
(If a variable has numerous distinct levels, a compact listing is
difficult to achieve and less likely to be very interesting or
useful.)

2. Put those levels into a macro, as one way of easing looping over
all those levels, namely as a precursor to -foreach-. This use can be
quick-and-dirty in terms of user time, or dirty-but-quick.
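
A minimal sketch of those two uses, for illustration only. The auto
dataset shipped with Stata and the variables rep78 and mpg are stand-ins
chosen here; they are not from this thread.

```stata
* Illustration only: auto.dta, rep78 and mpg are assumed stand-ins
sysuse auto, clear

* Use 1: show the distinct levels compactly
levelsof rep78

* Use 2: put the levels into a local macro, then loop with -foreach-
levelsof rep78, local(levels)
foreach l of local levels {
    quietly summarize mpg if rep78 == `l'
    display "rep78 = `l'   mean mpg = " %6.2f r(mean)
}
```

Note that each pass through such a loop applies an -if- condition to the
whole dataset, so k levels mean k passes over the data. That is
convenient to write, but it is not a route to speed.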

-levelsof- often sorts the data and then may sort back. So, you may
well need to follow -levelsof- with another -sort-.

The implication I take to be that, for problems like László's with
big datasets and a real worry about speed, -levelsof- is unlikely to
help. Speed is neither an aim nor a side-effect of -levelsof-.

As for "why isn't it used more often?" I have no precise data on its
use but guessing from usage in Statalist examples it seems to be about
as popular as it deserves, indeed a bit more popular than it deserves.

Nick
[email protected]


On 19 August 2013 16:21, László Sándor <[email protected]> wrote:
> But of course, the fastest example above is cheating a bit, as it knows
> the values of v1 and v2. A simple -bysort- to circumvent that would
> immediately punish us heavily with sorting dozens of gigabytes.
>
> But-but-but, my main use case uses the discrete values of a variable.
> Is -levelsof- faster than -bys-? (Then why isn't it used more often?)
>
> Or as in most cases the discrete values come from a previous xtiling,
> I know the value of this variable, or might even keep track of the
> quantiles in a local somewhere.
>
> Thanks for any thoughts on speeding up binned averaging.
>


