Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: David Hoaglin <dchoaglin@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Zero Inflated Negative Binomial model
Date: Sat, 21 Jan 2012 08:06:53 -0500

Eugene,

You are correct that using a ZINB model would be problematic. The NB distribution applies to counted data (i.e., it is possible for any nonnegative count to occur in the outcome variable). When you have only categories, that requirement is not satisfied, no matter what value you choose to represent each category.

I don't know whether the ordinal logit model has a zero-inflated version (I have not searched). Here "zero-inflated" would mean that the first category is inflated, since the numerical values associated with the ordered categories are only labels. If someone has worked out such a model, you would still need to determine whether, in your data, the assumption of proportional odds is reasonable.

You could try an ordinal logistic regression model with your data as they stand, and see what happens.

As an exploratory step, you could fit a binary logit model to "0 times" versus "1 or more times"; that would address the question of crossing the threshold into self-injurious behavior. You could then work with only the nonzero categories, dichotomize the outcome variable at each of the category boundaries (or some of them), and fit a binary logit model to each dichotomized outcome. Comparing the coefficients on the predictor variables among those models would give you an indication of whether the proportional odds model is reasonable.

You didn't describe the sorts of predictor variables that you have. Other analytic approaches may be possible.

David Hoaglin

On Fri, Jan 20, 2012 at 8:02 AM, Eugene Walls <Eugene.Walls@du.edu> wrote:
> I am working with a dataset that contains counts of the number of times that youth in the sample engage in self-harming behaviors (such as cutting). My co-authors and I are interested in using the zero-inflated negative binomial models because (a) our sample has about 74% zeroes and (b) we are conceptualizing two processes occurring: one that predicts the likelihood of crossing the threshold into self-injurious behavior and one that predicts the number of times of engaging in the behavior. The Vuong test seems to indicate that the ZINB model is a better fit for the data than the NBReg model.
>
> Our question is whether it is appropriate to use the ZINB, because the response set of the variable capturing the number of times of engaging in SIB is not a straight count, but rather "0 times", "1 time", "2-3 times", "4-5 times", "6-10 times", "11-20 times", "21-49 times", "50 or more times". We have recoded the variable into 0, 1, 2, 4, 6, 11, 21, 50 using the minimum in each category, but if we do that, is using the ZINB model problematic?
>
> Thanks
> Eugene
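
The exploratory dichotomization described in the reply can be sketched in code. This is a minimal Python illustration, not Stata and not part of the original message: the function names and the example responses are hypothetical, and the responses are assumed to be coded by category position (0 through 7) rather than by the category minimum. Each 0/1 variable it produces would serve as the outcome of its own binary logit fit.

```python
# Ordered response categories from the original question; the integer
# codes 0..7 are only labels for the ordering, not counts.
CATEGORIES = ["0 times", "1 time", "2-3 times", "4-5 times",
              "6-10 times", "11-20 times", "21-49 times", "50 or more times"]


def threshold_indicator(codes):
    """Binary outcome for '0 times' versus '1 or more times'."""
    return [1 if c >= 1 else 0 for c in codes]


def boundary_indicators(codes):
    """Among nonzero responses only, dichotomize at each category boundary.

    Returns {boundary label: list of 0/1}, where 1 means the response is
    in the category named by the label or in a higher one.
    """
    nonzero = [c for c in codes if c >= 1]
    return {
        f">= {CATEGORIES[k]}": [1 if c >= k else 0 for c in nonzero]
        for k in range(2, len(CATEGORIES))
    }


# Hypothetical responses, coded by category position (illustration only).
example_codes = [0, 0, 0, 1, 2, 0, 4, 7, 0, 3]
crossed = threshold_indicator(example_codes)
splits = boundary_indicators(example_codes)
```

If the coefficients from the logit fits on the entries of `splits` are roughly similar across boundaries, that is informal support for the proportional odds assumption; large differences suggest it does not hold.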

**Follow-Ups**:
- **Re: st: Zero Inflated Negative Binomial model** *From:* David Hoaglin <dchoaglin@gmail.com>

**References**:
- **st: Zero Inflated Negative Binomial model** *From:* Eugene Walls <Eugene.Walls@du.edu>
