Statalist



RE: st: Program to drop insignificant variables from an equation


From   "Lachenbruch, Peter" <Peter.Lachenbruch@oregonstate.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Program to drop insignificant variables from an equation
Date   Wed, 20 Aug 2008 11:57:33 -0700

Two comments here:
1.	You can treat all dummies (or all dummies created from a single
variable) as a group, so that they are either all dropped or all
included, by enclosing them in parentheses. It looks like
sw,pe(.2): reg y x1 x2 x3 (dummies)

2.	If you have missing values, stepwise will be run on complete
cases only.  This can result in severely biased estimates.  In one
example, I had 145 cases, but when I tried stepwise, I only had 77
observations.  This, of course, will be true of any regression command.  I
suggest using multiple imputation and then doing stepwise on each data
set.  The difference is striking.
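
To see how many cases a model actually uses, one can compare e(N) with
the total number of observations. The following is only a minimal
sketch: it assumes the auto data set shipped with Stata (in which rep78
has some missing values) rather than the data discussed above, and the
variable name nmiss is made up for illustration.

```stata
* Illustration only: auto.dta stands in for the real data
sysuse auto, clear

* Observations with a missing value in any model variable are dropped
regress price mpg weight rep78
display "observations used in estimation: " e(N)

* Total number of observations in the data set, for comparison
count

* Number of missing values per observation across the model variables
egen nmiss = rowmiss(price mpg weight rep78)
tabulate nmiss
```

If the two counts differ a lot, the complete-case sample may not be
representative of the full data.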

In general, stepwise regression leads to a biased regression, so it's
been suggested by Tibshirani (hearsay evidence) that variables that
don't enter the equation be given coefficients of 0.  Another
alternative is LARS - there is a lars command written by Mander (I
almost typed Lander!)

Tony

Peter A. Lachenbruch
Department of Public Health
Oregon State University
Corvallis, OR 97330
Phone: 541-737-3832
FAX: 541-737-4001


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Maarten buis
Sent: Wednesday, August 20, 2008 5:49 AM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Program to drop insignificant variables from an
equation

--- XandeR XandeR <xanderx@mail.ru> wrote:
> I want to make a script that automatically one by one drops
> insignificant variables from a regression, starting from the most
> insignificant one. I am currently writing MSc dissertation and I am
> running regression with 3 variables and 312 dummies not all of which
> are significant. So it would be a great help if the STATA could do it
> for me.

When you drop a dummy variable, the interpretation of the results changes
dramatically, as this will change the reference category. So I would
strongly recommend you do not do that.
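
A small sketch of the point, assuming the auto data set shipped with
Stata and dummies created from rep78 (both chosen here only for
illustration, not taken from the original question):

```stata
* Illustration only: auto.dta and rep78 stand in for the real data
sysuse auto, clear

* Creates the dummies rep1-rep5, one per category of rep78
tabulate rep78, generate(rep)

* rep1 is left out: each coefficient compares a category with rep78 == 1
regress price rep2 rep3 rep4 rep5

* rep2 is dropped as well: the reference group is now rep78 == 1 or 2,
* so the remaining coefficients answer a different question
regress price rep3 rep4 rep5
```

The coefficients on rep3 to rep5 are not comparable across the two
models, because the group they are contrasted with has changed.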

-- Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



