
Re: st: Zero Inflated Negative Binomial model

From: David Hoaglin
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Zero Inflated Negative Binomial model
Date: Sun, 22 Jan 2012 09:26:34 -0500

The exploratory step that I sketched earlier is closely related to a
hurdle model.  A brief discussion in the book by Agresti (2010) cites
papers by Saei et al. (1996) and Min and Agresti (2005).  For the
nonzero categories a cumulative logit model might work (as in ordinal
logistic regression), and you could try other cumulative link
functions.
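
A minimal sketch of that two-part (hurdle-style) approach in Stata
might look like the following.  The variable names are hypothetical:
freq is the ordered outcome coded 0, 1, ..., 4 (with 0 meaning "0
times"), and x1 and x2 stand in for whatever predictors are available.
The number of nonzero categories (four) is also an assumption.

```stata
* Part 1: binary logit for crossing the threshold (0 vs. 1 or more).
* The !missing() guard matters because Stata treats missing as larger
* than any number, so (freq > 0) alone would be true for missing values.
generate byte any_beh = (freq > 0) if !missing(freq)
logit any_beh x1 x2

* Part 2: cumulative logit model fitted to the nonzero categories only.
ologit freq x1 x2 if freq > 0 & !missing(freq)

* Another cumulative link (probit) for comparison.
oprobit freq x1 x2 if freq > 0 & !missing(freq)

* Informal check of proportional odds: dichotomize the nonzero
* categories at each boundary and fit a separate binary logit.
forvalues k = 2/4 {
    generate byte ge`k' = (freq >= `k') if freq > 0 & !missing(freq)
    logit ge`k' x1 x2
}
```

If the coefficients on x1 and x2 are roughly similar across the
binary fits in the loop, the proportional odds assumption looks
reasonable for the nonzero categories.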

References

Agresti A (2010). Analysis of Ordinal Categorical Data, second edition.  Wiley.

Min Y, Agresti A (2005). Random effect models for repeated measures of
zero-inflated count data.  Statistical Modelling 5:1-19.

Saei A, Ward J, McGilchrist CA (1996). Threshold models in a methadone
programme evaluation.  Statistics in Medicine 15:2253-2260.

David Hoaglin

On Sat, Jan 21, 2012 at 8:06 AM, David Hoaglin <dchoaglin@gmail.com> wrote:
> Eugene,
>
> You are correct that using a ZINB model would be problematic.  The NB
> distribution applies to counted data (i.e., it is possible for any
> nonnegative count to occur in the outcome variable).  When you have
> only categories, that requirement is not satisfied, no matter what
> value you choose to represent each category.
>
> I don't know whether the ordinal logit model has a zero-inflated
> version (I have not searched).  Here "zero-inflated" would mean that
> the first category is inflated, since numerical values associated with
> the ordered categories are only labels.  If someone has worked out
> such a model, you would still need to determine whether, in your data,
> the assumption of proportional odds is reasonable.  You could try an
> ordinal logistic regression model with your data as they stand, and
> see what happens.
>
> As an exploratory step, you could fit a binary logit model to "0
> times" versus "1 or more times"; that would address the question of
> crossing the threshold into self-injurious behavior.  You could then
> work with only the nonzero categories and dichotomize the outcome
> variable at each of the category boundaries (or some of them) and fit
> a binary logit model to each dichotomized outcome.  Comparison of the
> coefficients on the predictor variables among those models would give
> you an indication of whether the proportional odds model is
> reasonable.
>
> You didn't describe the sorts of predictor variables that you have.
> Other analytic approaches may be possible.
>
> David Hoaglin
