



Re: st: non-normal residual


From   Maarten buis <maartenbuis@yahoo.co.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: non-normal residual
Date   Fri, 30 Apr 2010 00:12:18 -0700 (PDT)

--- On Fri, 30/4/10, Fabio Zona wrote:
> how should one run a regression if residuals
> (errors) are not normally distributed (as per P-P plot and
> Q-Q plot) and transformation of Y (e.g., log, square
> root, and the like) does not help to improve this
> distribution of residuals?

In most cases I would just run -regress-. There are cases,
however, where the non-normality gives some hints on better
ways of modeling your data, e.g.:
o A spike, a single value that received a lot of 
  observations. Especially if that spike occurs at a 
  meaningful value (often 0) you might want to consider
  a two part model, like -tobit-, -heckman-, or -zip-.
o Bimodality can happen when you left out an important
  dummy variable. An example of that can be found here:
  <http://www.maartenbuis.nl/software/hangroot.html>
  If you add the union dummy, the bimodality disappears.
o All observations lumped together at a few values 
  indicates that your dependent variable takes only a few
  values. In this case you might want to consider an 
  ordinal model, especially if you are unsure whether the
  distances between the categories represented by these
  values are close to equal, which is what you will
  typically assume when using -regress- on an ordinal
  variable.
o I am sure others can come up with more scenarios...
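
The checks described in the list above could be sketched
roughly as follows. This is only an illustration: it uses
the nlsw88 and auto datasets that ship with Stata, and the
choice of covariates is an assumption made for the sketch,
not necessarily the model shown on the page linked above.
Whether the shape of the residuals actually changes between
steps 1 and 2 in this specification is something to look at
in the graphs, not something the sketch guarantees.

```stata
* Sketch only: the covariates below are illustrative
* assumptions, not the model from the linked page.
sysuse nlsw88, clear
generate double ln_w = ln(wage)

* 1. plain linear model; look at the shape of the residuals
regress ln_w grade ttl_exp tenure
predict double res1 if e(sample), residuals
histogram res1, normal name(res1, replace)
qnorm res1, name(q1, replace)

* 2. add a candidate omitted dummy and look again
regress ln_w grade ttl_exp tenure union
predict double res2 if e(sample), residuals
histogram res2, normal name(res2, replace)

* 3. a dependent variable that takes only a few values:
*    compare -regress- with an ordinal model
sysuse auto, clear
tabulate rep78
regress rep78 mpg weight
ologit rep78 mpg weight
```

Comparing the two histograms from steps 1 and 2 side by
side is the point of the exercise; in step 3 the question
is whether the conclusions from -regress- and -ologit-
differ in a way that matters for your problem.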

In short, it depends on the kind of non-normality. There
is, however, no magic (black, white, or otherwise) involved
in figuring out whether non-normality is a problem or not.
It is mostly a matter of using common sense to figure out
what is causing it, and whether the process that is causing
the non-normality contains information that you want to
use or want to ignore.

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------




