# Re: RE: st: What multiple regression model for extreme distributions

 From Robert A Yaffee To statalist@hsphsun2.harvard.edu Subject Re: RE: st: What multiple regression model for extreme distributions Date Tue, 02 Feb 2010 13:14:38 -0500

```
Muhammed,
You might want to try a Weibull, Frechet, or Gumbel regression.
If you can formulate a regression with the extreme value
distribution and adjust its location, scale, and shape parameters
to fit your model, that might be a good option.
Alternatively, you could run a Bayesian model with one of those
as a conjugate prior if you don't want to use a noninformative prior.
Cheers,
Bob

Robert A. Yaffee, Ph.D.
Research Professor
Silver School of Social Work
New York University

Biosketch: http://homepages.nyu.edu/~ray1/Biosketch2009.pdf

CV:  http://homepages.nyu.edu/~ray1/vita.pdf

----- Original Message -----
From: Maarten buis <maartenbuis@yahoo.co.uk>
Date: Tuesday, February 2, 2010 10:31 am
Subject: RE: st: What multiple regression model for extreme distributions
To: stata list <statalist@hsphsun2.harvard.edu>

> > I have household income survey data (38,000 observations), and my
> > problem is doing a multiple regression of saving (dependent var)
> > on ethnicity/strata/employment etc. (independent vars).
> >
> > The problem is this: 70% of my observations have a saving value of
> > zero. I recoded the zeros to 1 and took logs, but the distribution is
> > still extremely skewed (mean 0.78, std dev 2.4, min 0, max 14). The
> > histogram still looks like the letter L, extremely skewed to the
> > right with a long tail. Obviously, OLS is out, and I tried Poisson
> > (glm nbinomial), but the distribution is still not normal.
> > The data are in order, i.e. no missing values etc. They are clean. For
> > some reason, lobit would not run.
>
> One option is to fit a -zip- to the original saving variable and
> use the -robust- option. That way you are modeling mean saving as a
> two-step process: first a person decides whether or not to save, and
> then those who do save decide on the amount. The influence of the
> explanatory variables on the explained variable occurs through the log
> link function, so you avoid the problem that you are no longer modeling
> mean savings when you first take the log of savings and then model
> that transformed variable. Notice that by using the -robust- option you
> are only assuming that your model for the mean is correct;
> you are not making additional distributional assumptions.
>
> Hope this helps,
> Maarten
>
> --------------------------
> Maarten L. Buis
> Institut fuer Soziologie
> Universitaet Tuebingen
> Wilhelmstrasse 36
> 72074 Tuebingen
> Germany
>
> http://www.maartenbuis.nl
> --------------------------
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
```
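
Maarten's point about logging first and then modeling can be illustrated with a few lines of arithmetic. The sketch below is not from the thread: it uses Python rather than Stata, and the ten savings values are hypothetical toy numbers chosen only to mimic "70% zeros plus a long right tail". It shows that the mean of the logged variable, once back-transformed, can be very far from the mean of savings, which is the quantity a log-link model such as -zip- or -poisson- targets.

```python
import math

# Hypothetical toy data (NOT the survey): 70% zeros and a long right tail.
savings = [0] * 7 + [100, 1000, 50000]

# Arithmetic mean of savings: what a log-link model for the mean targets.
mean_saving = sum(savings) / len(savings)

# The transformation described in the question: recode 0 to 1, then take logs.
logged = [math.log(max(s, 1)) for s in savings]
mean_of_logs = sum(logged) / len(logged)

# Back-transforming the mean of the logs gives a geometric-mean-like
# quantity, not the arithmetic mean of savings.
back_transformed = math.exp(mean_of_logs)

print(f"arithmetic mean of savings: {mean_saving:.1f}")
print(f"exp(mean of log savings):   {back_transformed:.1f}")
print(f"log of the mean:            {math.log(mean_saving):.3f}")
print(f"mean of the logs:           {mean_of_logs:.3f}")
```

A regression on the logged variable describes the second quantity, so its coefficients need not carry over to mean savings. A model with a log link fitted to the untransformed variable avoids that step.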