# Re: st: regression with dependent variable ranging from 0 to 1

From: "Austin Nichols"
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: regression with dependent variable ranging from 0 to 1
Date: Tue, 30 Dec 2008 14:42:58 -0500

```
Andrea Rispoli <andrea.rspl@gmail.com>:
It looks like you have computed a normalized Herfindahl index since
the traditional index ranges from 1/N to 1, not zero to one.  So your
variable will be zero if there are 2 firms with equal shares or 50
firms with equal shares, though these are not analogous cases in terms
of Cournot oligopoly output.  If these are actually firms in markets,
however, you will generically not get any actual zeros.

It looks like many of your observations are at least very close to
zero, since the mean is so much smaller than the standard deviation,
so I am guessing the outcome is very skewed.  If you want to stick
with your normalized index you may want to model the expected index as
exp(Xb) using a log link in -glm- or the -poisson- command.

Given that you have 213,620 obs, I am guessing these are individual
firms, not markets, so many obs in the same market share the same
value of H.  That means you should certainly cluster on market if you
stick with the data you have.  But note that the normalized H (sum of
squared shares less 1/N, over 1-1/N) is N times the sample variance of
shares, so you are estimating a variance of individual outcomes within
a market, and you might be better off with some kind of mixed model
using the raw shares rather than the H construct.

On Tue, Dec 30, 2008 at 2:16 PM, Maarten buis <maartenbuis@yahoo.co.uk> wrote:
> --- Andrea Rispoli <andrea.rspl@gmail.com> wrote:
>> It is an Herfindahl index of concentration, it ranges from 0 to 1 (in
>> principle) : in my specific case:
>>
>>     Variable |       Obs        Mean    Std. Dev.       Min        Max
>> -------------+--------------------------------------------------------
>>            H |    213620    .0190621    .0920916          0   .6477536
>
> How many zeros do you have? ( type in Stata: -count if H == float(0)- )
> Even though it is possible for a fractional logit to model a dependent
> variable that includes zero (and one), if there are too many of these,
> then that might indicate that these zeros occur due to a separate
> process and need to be modeled separately.
>
```
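The link between the normalized index and the variance of shares mentioned in the reply can be written out explicitly. This is a short derivation under the usual definitions, which are assumed here rather than stated in the thread: shares $s_i$ of the $N$ firms in a market sum to one, so the mean share is $1/N$, and $H = \sum_i s_i^2$.

```latex
\begin{align*}
s^2 &= \frac{1}{N-1}\sum_{i=1}^{N}\left(s_i - \frac{1}{N}\right)^2
     = \frac{1}{N-1}\left(\sum_{i=1}^{N} s_i^2 - \frac{2}{N}\sum_{i=1}^{N} s_i + \frac{1}{N}\right)
     = \frac{H - 1/N}{N-1}, \\
H^{*} &= \frac{H - 1/N}{1 - 1/N}
       = \frac{N\,(H - 1/N)}{N-1}
       = N\, s^2 .
\end{align*}
```

So the normalized index $H^{*}$ is zero exactly when all shares are equal, whatever $N$ is, which is why 2 equal firms and 50 equal firms both give zero.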
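The suggestions in the thread (counting exact zeros, modeling the expected index as exp(Xb), and clustering on market) might look roughly like the following in Stata. This is only a sketch: `x1`, `x2`, and `market_id` are hypothetical variable names that do not appear in the thread, and the right covariates and cluster variable depend on the actual data.

```stata
* Sketch only: x1, x2 and market_id are hypothetical names, not from the thread.

* How many exact zeros are there? (the check suggested in the quoted message)
count if H == 0

* Model E[H | X] = exp(Xb) with a log link, with standard errors clustered on market
glm H x1 x2, family(poisson) link(log) vce(cluster market_id)

* The same mean model fit with -poisson-, again clustering on market
poisson H x1 x2, vce(cluster market_id)
```

Both commands fit the same exponential mean, so the coefficients should agree; Stata will note that H has noninteger values, which is expected when a Poisson-type estimator is used on a continuous outcome like this one.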