Statalist


RE: st: Does Blasnik's Law apply to -use-?


From   "Newson, Roger B" <r.newson@imperial.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Does Blasnik's Law apply to -use-?
Date   Tue, 18 Sep 2007 22:13:05 +0100

My query at the UK Stata User Meeting about the "_prefix suite of
commands" was limited strictly to commands for which help is accessible
by typing in Stata

whelp _prefix

which (at least on my Stata 9) will cause Stata to open an on-line help
window for a set of commands, which seem to be used internally by
StataCorp for implementing prefixes, such as -bootstrap-, -jackknife-,
-permute-, -nestreg-, -rolling-, -simulate-, -statsby-, -stepwise-, and
-svy-. This query was not intended to apply to all commands whose name
starts with "_".

As I understood it, Vince's warning was that non-StataCorp users who get
deeply involved with the commands described by typing

whelp _prefix

do so at their own risk, as these commands may be revised internally by
StataCorp without non-StataCorp users even being warned, never mind
being consulted. The reason for this (as I understood it) is that there
are complicated rules constraining what prefixes may appear before or
after what other prefixes. If a user invents a new prefix, then
StataCorp cannot be expected to maintain its existing prefixes so that
the new prefix will always interact with all the other prefixes as the
user would like. (Vince might like to correct me if I am wrong.)

I hope this helps.

Best wishes

Roger


Roger Newson
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: r.newson@imperial.ac.uk 
Web page: www.imperial.ac.uk/nhli/r.newson/
Departmental Web page:
http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/

Opinions expressed are those of the author, not of the institution.

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Sergiy Radyakin
Sent: 18 September 2007 17:55
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Does Blasnik's Law apply to -use-?

Hello Roger,

could you please summarize what the warning was about? And, in
particular, whether it relates to "_prefix"-commands or to "_"-prefix
commands (where  "_prefix_expand" would be an example of the former
and "_regress" an example of the latter).

Though the help for the "_prefix"-commands seems to be interesting, I
find it more exciting to learn about the commands which are not only
undocumented but not even mentioned anywhere, not even on the
internet (Google currently returns 0 links). Does anyone have an idea
of how the "_xt..." commands work? I mean these:

_xtarm
_xtmka
_xtmkz
_xtzw
_xtwhw
_xta2

Does anyone have a complete list of _all Stata commands and is willing
to present it to the community?

Thank you,
   Sergiy





On 9/16/07, Newson, Roger B <r.newson@imperial.ac.uk> wrote:
> Thanks to David Elliott, Mike Blasnik and David Airey for their very
> helpful and detailed replies to my query. These shall be used to inform
> the first Stata 10 update to -parmby-, when I have Stata 10.
>
> And thanks also to Vince Wiggins, who warned me (during the 13th UK
> Stata User Meeting last week) of the dangers of ordinary users trying to
> get too deep into the undocumented _prefix suite of commands, used
> internally by StataCorp for -statsby- and other prefixes. (In Stata,
> type
>
> whelp _prefix
>
> to find out more about these.)
>
> Best wishes
>
> Roger
>
>
> Roger Newson
> Lecturer in Medical Statistics
> Respiratory Epidemiology and Public Health Group
> National Heart and Lung Institute
> Imperial College London
> Royal Brompton campus
> Room 33, Emmanuel Kaye Building
> 1B Manresa Road
> London SW3 6LR
> UNITED KINGDOM
> Tel: +44 (0)20 7352 8121 ext 3381
> Fax: +44 (0)20 7351 8322
> Email: r.newson@imperial.ac.uk
> Web page: www.imperial.ac.uk/nhli/r.newson/
> Departmental Web page:
> http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/
>
> Opinions expressed are those of the author, not of the institution.
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of David Elliott
> Sent: 14 September 2007 15:07
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: Does Blasnik's Law apply to -use-?
>
> Being Stata users, we should approach this in a rigorous scientific
> fashion:
>
> X-----begin-----X
>
> program define intest
> version 9.0
>
> *! version 1.0.0  2007.09.13
> *! Simulate using part of file with in #/##
> *! by David C. Elliott
> *!
> *! using name of trial dataset
> *! postname specifies filename of postfile
> *! numblocks is number of file blocks to create
>
>
> syntax using/ ,POSTname(string) NUMblocks(int)
>
> local more `c(more)'
> set more off
>
> use `using', clear //Load first to eliminate any first pass caching effects
> local recblock = round(`c(N)'/`numblocks',1)
>
> tempname post
> postfile  `post' double block float timein timeif using `postname', every(10) replace
>
> timer clear 1
> n di _n(2) "{txt}{col 11}{center 10:-- IF --}{center 10:-- IN --}" _n ///
>  "{center 10:Block}{center 10:Time}{center 10:Time}" _n ///
>  "{hline 30}"
> local lastblock = `c(N)' - `recblock'
> forvalues i=1(`recblock')`lastblock' {
>        local block = `i'
>        foreach I in if in {
>                if "`I'" == "in" {
>                        local ifin in `i'/`=`i'+`recblock''
>                        }
>                        else {
>                                local ifin if inrange(_n, `i', `=`i'+`recblock'')
>                                }
>                timer on 1
>                use `using' `ifin', clear
>                timer off 1
>                qui timer list 1
>                local time`I' :display %5.2f round(`r(t1)',.01)
>                timer clear 1
>                }
>        post `post' (`block') (`timein') (`timeif')
>        n di "{res}{ralign 10:`block'}{ralign 10:`timeif'}{ralign 10:`timein'}"
>        }
> postclose `post'
> set more `more'
> use `postname', clear
> lab var block "Record Block"
> lab var timein "Load Time using IN"
> lab var timeif "Load Time using IF"
> tw line timein block || line timeif block
> end
>
> X-----end-----X
>
> eg:
>
> . intest using dss_data_06_07.dta , postname(intest.dta) numblocks(100)
>
>
>           -- IN --  -- IF --
>  Block      Time      Time
> ------------------------------
>         1      0.64      0.88
>     17278      0.47      0.77
>     34555      0.47      0.77
>     51832      0.47      0.78
>     69109      0.45      0.78
>     86386      0.45      0.78
>    103663      0.47      0.78
>    120940      0.47      0.77
>  ...
>
> This adofile will run an -if- versus -in- simulation and graph the
> results.  From my findings I can confirm a speed advantage of about
> 50% using -in- on a dataset with obs:1,727,673 vars:28 size:266,061,642
>
> However, things get murkier.  Run a simulation, then max out Stata's
> memory setting with as much memory as the system will give you and run
> the simulation again.  When you do this, you eliminate the system's
> ability to cache the file.  Ordinarily, subject to filesize and
> available memory, Stata may be reading the file from cache.  If this
> is the case, one will see an advantage to using -in-.  However, if the
> caching advantage is eliminated by increasing Stata memory, my
> simulations show the speed advantage of using -in- is negated.  I also
> tested this on large network databases and was unable to demonstrate
> any advantage to -in-.
>
> So back to Roger's initial question.  It would appear that for
> cacheable filesizes and large numbers of bygroups a strategy using
> -in- might be feasible.  There is an overhead penalty of setting up
> the bygroups to make them selectable using -in- involving sorts and
> the like.  For a small number of bygroups the speed advantages might
> be lost, but for many levels and a large number of iterations there
> would be an advantage.
>
> DC Elliott
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/


