Re: st: Selecting Best Regression Equation using STATA

 From Ronán Conroy <[email protected]> To "statalist hsphsun2.harvard.edu" <[email protected]> Subject Re: st: Selecting Best Regression Equation using STATA Date Tue, 06 Apr 2004 11:19:27 +0100

```on 05/04/2004 19:34, wildscop at [email protected] wrote:

>
> Does STATA provide any command to form any of the following procedures to
> find Best Regression Equation -

Stata's strength lies in its user-contributed procedures. In this case, the
best regression equation is of the form:

Response variable is a function of the theory you are using, informed by
interacting with the data, and adjusted for confounding variables.

I don't mean to be flippant, but the construction of a regression equation
is like the investigation of a crime. Many leads need to be checked out, and
the solution has to fit the known data but, most important of all, it has to
make sense. The trouble with 'automatic' methods is that they don't know any
science, and cannot be expected to.

And what if there is no science? What if you just have a whole bunch of
predictors and want to weed out the redundant ones? The trouble is model
shrinkage. The model you build will over-estimate the predictive ability of
the predictor variables. If applied to a new sample of data, the fit of any
model will drop by a factor that is roughly the ratio of adjusted R-squared
to R-squared (for least squares models; for other models the calculation is
more complex). However, for stepwise models and best subset models the
adjusted R-squared should be calculated based on *all* the predictors that
were *considered* for inclusion, not just the ones that were included. With
many predictors, this can mean that shrinkage makes your model useless a
priori.
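
To make the arithmetic concrete, here is a purely illustrative calculation
(the numbers are invented for the example, not taken from any dataset).
For least squares with n observations and k predictors plus a constant,

  adjusted R-squared = 1 - (1 - R-squared) * (n - 1) / (n - k - 1)

Suppose n = 100 and a stepwise search over 30 candidate predictors keeps 5,
giving R-squared = 0.40:

  k = 5  (predictors kept):        1 - 0.60 * 99/94 = 0.368
  k = 30 (predictors considered):  1 - 0.60 * 99/69 = 0.139

The ratio to R-squared is about 0.92 using k = 5, but only about 0.35 using
k = 30, and it is the second figure that reflects the search that was
actually carried out.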

Ronan M Conroy ([email protected])
Lecturer in Biostatistics
Royal College of Surgeons
Dublin 2, Ireland
+353 1 402 2431 (fax 2764)

--------------------
Just say no to drug reps
http://www.nofreelunch.org/

