From | Maarten Buis <maartenlbuis@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: where is StataCorp C code located? all in a single executable as compiled binary? |
Date | Tue, 20 Aug 2013 10:21:37 +0200 |
On Mon, Aug 19, 2013 at 7:30 PM, László Sándor wrote:
> The other option seemed to be to try to keep track of the levels of
> "bins", and just forval loop over the values, if-ing in a bin at a
> time to quickly grab the means. This was surprisingly fast, and does
> not seem to be any slower without a sort beforehand. Again, I am not
> sure any efficiency of -bys- looping of ifs does not seem to be worth
> the cost of the initial sorting.

I think you are mixing up advice here: -by: <something>- is likely to
be faster than a -forvalues- loop combined with -if- conditions. I
don't think anyone suggested that you sort before that loop.

The logic is that an -if- condition will, by necessity, have to go
through all observations each time. The alternative would be a single
sort with -in- conditions, which I guess is what is at the core of the
speed of the -by- prefix. Depending on how many times you want to use
-if- conditions, there will be a point where the combination of a
single -sort- and many -in- conditions will be quicker than many -if-
conditions. But I don't expect that -sort-ing will help if you choose
the -forvalues- loop combined with -if- conditions.

On a pragmatic level: how much time have you now spent trying to write
this code, and how much time do you expect to save with that? Are you
sure that you don't end up with a net loss of time?

-- Maarten

---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany

http://www.maartenbuis.nl
---------------------------------
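
The three approaches compared in the reply above (a -forvalues- loop with -if-, the -by- prefix, and a single -sort- followed by -in- ranges) can be sketched roughly as follows. This is only an illustration, not the original poster's code: the auto dataset shipped with Stata, the variables rep78 and price, and the helper variable groupsize are all assumptions standing in for the real data, the "bins", and the outcome.

```stata
* Illustration only: auto data, rep78 and price are stand-ins for the
* poster's data, bins, and outcome; groupsize is a hypothetical helper.
sysuse auto, clear
drop if missing(rep78)

* Approach 1: -forvalues- loop with -if-.
* Each pass has to evaluate the condition on every observation.
forvalues b = 1/5 {
    summarize price if rep78 == `b', meanonly
    display "if: rep78 = `b', mean price = " r(mean)
}

* Approach 2: the -by- prefix (sorts once, then works group by group).
bysort rep78: summarize price

* Approach 3: one -sort-, then -in- ranges covering one group at a time.
sort rep78
by rep78: generate long groupsize = _N   // number of rows in each group
local first = 1
while `first' <= _N {
    local last = `first' + groupsize[`first'] - 1
    local b = rep78[`first']
    summarize price in `first'/`last', meanonly
    display "in: rep78 = `b', mean price = " r(mean)
    local first = `last' + 1
}
```

Approaches 1 and 3 should report the same group means. Which one is faster presumably depends on the number of observations and the number of groups, so it would be worth timing both (for example with -timer-) on data of realistic size before committing to either.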