
Re: st: RE: All-possible-regressions procedure

Subject   Re: st: RE: All-possible-regressions procedure
Date   Fri, 19 Sep 2003 10:07:05 -0700


Thanks for your response.  I'm looking for something analogous to the SAS
command (I forget what it is exactly), which selects the "best" # (you
specify the #) of models using 1 covariate, 2 covariates, etc.  The
investigator then explores the resulting models.  It's not really an
automatic procedure in the sense of forward or backward selection.  Is that
what -allpossible- does?

I know it's not feasible to give me a short course in multiple regression,
but what is your basic philosophy for whittling down potential explanatory
variables when building an explanatory model (as opposed to a predictive
model)?


"Nick Cox" <>
Sent by: owner-statalist@hsphsun2.h
09/19/2003 09:43 AM
Subject: st: RE: All-possible-regressions procedure

> Does Stata have an all-possible-regressions procedure, for
> use in model
> building in multiple regression?

Sort of.

I wrote a program -allpossible-
which does this to a limited extent.
It's really a wrapper that runs
lots of regressions or similar
and prints out selected results, but
it has strict limits.

I also wrote a program -selectvars-
which is, indirectly, one of the
tools you need if you want to knit
your own and cycle through subsets
of predictors.
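
As a rough illustration of what "cycling through subsets of predictors"
involves, here is a minimal sketch in Python. It is not Stata code and it
is not how -selectvars- is implemented; the function name and the
predictor names x1, x2, x3 are hypothetical, chosen only for the example.

```python
from itertools import combinations


def predictor_subsets(predictors, max_size=None):
    """Yield every non-empty subset of predictors, smallest subsets first."""
    if max_size is None:
        max_size = len(predictors)
    for size in range(1, max_size + 1):
        # combinations() keeps the original order and never repeats a subset
        for subset in combinations(predictors, size):
            yield subset


# Hypothetical predictor names, for illustration only.
predictors = ["x1", "x2", "x3"]
subsets = list(predictor_subsets(predictors))

for subset in subsets:
    # Each line is one candidate right-hand side for a regression.
    print(" ".join(subset))
```

Each printed line could then be handed to whatever fitting routine you
prefer; the sketch only generates the candidate lists and does no
fitting or ranking.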

Both are on SSC.

However, I have not written anything
to select "best" regressions in any
sense whatsoever, not only because
that is much more difficult to program,
but also because it's against my religion.
(I am not aware that anyone else has, either.)

In fact, before someone starts thinking
"gun manufacturer" I will stress that my
own motivation for writing -allpossible-
was the complete opposite of what I take
to be a common motivation for using
such programs. I wanted to show that
a variety of models were, in a class of
problems, almost equally good in terms
of various criteria. The naive opposite
is, naturally, the idea that one can
automate the selection process completely.

I'll add a standard rider. No such program
can cope adequately with the combinatorial
explosion(*) of possibilities. Even if you
trim matters down to whether each
predictor is in or out of the model,
20 predictors give you 2^20 ~ 10^6
models, and it's pretty hard to keep
track of a million sets of model results.
(20 predictors is perhaps a small model
by many analysts' standards.) Even 2^10
is more than I want to compare.
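
The arithmetic behind those figures can be checked with a short Python
sketch. The function names are hypothetical, and the count assumes the
in-or-out framing above, so it includes the model with no predictors at
all.

```python
from math import comb


def count_models(n_predictors):
    """Number of in/out combinations of n predictors (empty model included)."""
    return 2 ** n_predictors


def count_models_by_size(n_predictors):
    """Number of candidate models for each number of predictors k."""
    return {k: comb(n_predictors, k) for k in range(n_predictors + 1)}


total_20 = count_models(20)
total_10 = count_models(10)
by_size_20 = count_models_by_size(20)

print(total_20)        # 1048576, i.e. roughly 10^6
print(total_10)        # 1024
print(by_size_20[10])  # 184756 models with exactly 10 of the 20 predictors
```

The per-size counts sum to the total, so restricting attention to models
of a given size helps only modestly in the middle of the range.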


(*) A nice term. When was it introduced?
