# Re: st: non-normal residual

 From Maarten buis To statalist@hsphsun2.harvard.edu Subject Re: st: non-normal residual Date Fri, 30 Apr 2010 00:12:18 -0700 (PDT)

```--- On Fri, 30/4/10, Fabio Zona wrote:
> how should one run a regression, if residuals
> (errors) are not normally distributed (es for P-P plot and
> Q-Q plot) and transformation of Y (i.e., log, square
> root..and the like) does not help to improve this
> distribution of residuals??

In most cases I would just run -regress-. However, there
are cases where the non-normality might give some hints on
better ways of modeling your data, e.g.:
o A spike, a single value that received a lot of
observations. Especially if that spike occurs at a
meaningful value (often 0) you might want to consider
a two-part model, like -tobit-, -heckman-, or -zip-.
o Bimodality can happen when you left out an important
dummy variable. An example of that can be found here:
<http://www.maartenbuis.nl/software/hangroot.html>
If you add the union dummy, the bimodality disappears.
o All observations lumped together at a few values
indicates that your dependent variable takes only a few
values. In this case you might want to consider an
ordinal model, especially if you are unsure whether the
distances between the categories represented by these
values are close to equal, which is what you will
typically assume when using -regress- on an ordinal
variable.
o I am sure others can come up with more scenarios...

In short, it depends on the kind of non-normality. There
is however no magic (black, white, or otherwise) involved
in figuring out whether non-normality is a problem or not;
it is mostly a matter of using common sense to figure out what is
causing it, and figuring out if the process that is causing
the non-normality contains information that you want to
use or want to ignore.

Hope this helps,
Maarten

```
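
The checks described in the post above could be sketched in Stata roughly as follows. This is only an illustration under stated assumptions: the shipped example datasets nlsw88 and auto, the covariates, and the variable names ln_wage, res0 and res1 are arbitrary choices that do not come from the original message, and no particular result is claimed for these data.

```stata
* Illustrative sketch only: datasets and covariates are assumptions,
* not taken from the original post.
sysuse nlsw88, clear
generate ln_wage = ln(wage)

* Fit the baseline model and look at the shape of the residuals
regress ln_wage grade ttl_exp tenure
predict double res0, residuals
qnorm res0, name(qq0, replace)
kdensity res0, normal name(kd0, replace)

* Add a candidate omitted dummy and inspect the residuals again
regress ln_wage grade ttl_exp tenure i.union
predict double res1, residuals
kdensity res1, normal name(kd1, replace)

* If the outcome takes only a few values, tabulate it and
* consider an ordinal model instead of -regress-
sysuse auto, clear
tabulate rep78
ologit rep78 mpg weight
```

Comparing the two density plots is one way to see whether a suspected omitted dummy accounts for a second mode; the tabulation is a quick check for an outcome that is lumped at a handful of values.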