
Re: st: Opinions on fractional logit versus tobit - prediction and model fit

From   Stas Kolenikov
Subject   Re: st: Opinions on fractional logit versus tobit - prediction and model fit
Date   Fri, 3 Apr 2009 08:57:18 -0500

On 4/2/09, Eva Poen wrote:
>  - Although it appears to be a very elegant solution, some people say
>  that FLM is not well suited for problems with a lot of zeros or ones;
>  for example, Maarten Buis said so in this post (but didn't provide a
>  reference):
>  If someone knows any references where this is discussed, I'd be
>  grateful to receive them.

If you have figured out -gllamm-, you might be able to use it to set
up a mixture/zero-inflation model with a two-point distribution of the
latent variable, using the -ip(f) nip(2)- options for the relevant
part of the model. I would be more convinced by this approach if you
had some panels that consist entirely of zeroes and other panels that
mix zeroes and non-zeroes, rather than every panel having 5 zeroes and
one or two non-zeroes. Zero-inflation models essentially state that an
individual is either in a "don't-do-it" class, where the outcome is
always zero, or in a "do-it-sometimes" class, where zeroes occur at
random alongside non-zeroes. See -zip- for a canned routine that does
this in official Stata.
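
To make the two-class idea concrete, here is a minimal Python sketch
(not Stata, and not what -zip- or -gllamm- compute internally). It
assumes a zero-inflated Poisson outcome and, as a further assumption,
that class membership is fixed within a panel. The names pi_never,
lam, zip_pmf and posterior_never are hypothetical, chosen only for
illustration.

```python
import math


def zip_pmf(k, pi_never, lam):
    """P(Y = k) under a zero-inflated Poisson.

    pi_never: assumed share of the "don't-do-it" class (always zero).
    lam:      assumed Poisson mean in the "do-it-sometimes" class.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeroes come from both classes.
        return pi_never + (1.0 - pi_never) * poisson
    # Non-zero outcomes can only come from the "do-it-sometimes" class.
    return (1.0 - pi_never) * poisson


def posterior_never(n_periods, pi_never, lam):
    """P("don't-do-it" class | panel of n_periods zeroes and nothing else).

    Assumes class membership is constant within a panel and outcomes are
    independent across periods given the class.
    """
    p_all_zero_sometimes = math.exp(-lam * n_periods)
    return pi_never / (pi_never + (1.0 - pi_never) * p_all_zero_sometimes)


if __name__ == "__main__":
    # Longer all-zero panels point more strongly to the "don't-do-it" class.
    for t in (1, 3, 6):
        print(t, round(posterior_never(t, pi_never=0.3, lam=0.5), 3))
```

Under these assumptions, a panel with even one non-zero outcome
belongs to the "do-it-sometimes" class with certainty, so only the
all-zero panels carry information about the size of the "don't-do-it"
class.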

By the way, it might be worth looking at -xtpoisson- or -xtmepoisson-
if your data are integers. Check whether treating your dependent
variable as a count is at least approximately reasonable in your
application.
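
A rough way to screen for this, sketched in Python under simplifying
assumptions: check that the values are non-negative integers, then
look at the share of zeroes and the variance-to-mean ratio. The helper
count_diagnostics is hypothetical. It ignores covariates and the panel
structure, so treat it as a first look rather than a test.

```python
from statistics import mean, variance


def count_diagnostics(values):
    """Crude, unconditional checks on whether data look like counts."""
    values = list(values)
    # Counts must be non-negative whole numbers.
    is_count = all(v >= 0 and float(v).is_integer() for v in values)
    m = mean(values)
    # A Poisson variable has variance equal to its mean, so a ratio far
    # above 1 hints at overdispersion or excess zeroes.
    dispersion = variance(values) / m if m > 0 else float("nan")
    share_zero = sum(1 for v in values if v == 0) / len(values)
    return {
        "is_count": is_count,
        "dispersion": dispersion,
        "share_zero": share_zero,
    }


if __name__ == "__main__":
    print(count_diagnostics([0, 0, 0, 2, 1, 0, 3]))
```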

>  - I am getting sensible estimates for the random effects with the
>  tobit approach, and not so sensible ones with FLM. In fact, FLM
>  estimates two of the three to be zero. Is this a sign of my model
>  being incorrectly specified, or could it be a sign of FLM not handling
>  the zeros and ones very well?

As far as I know (and you should not over-rely on this :)), the tobit
model is usually the one that behaves strangely, since it is quite
sensitive to violations of the normality assumption. So I probably
wouldn't put too much weight on this kind of comparison. In all
likelihood, BOTH models are misspecified, and you just need to find
the one that is more reasonable than the other :)). Dig for George
Box's quote on this!

Stas Kolenikov, also found at
Small print: I use this email account for mailing lists only.