


Re: st: which -cmp- option to use for poisson model with count data?

From   David Hoaglin
Subject   Re: st: which -cmp- option to use for poisson model with count data?
Date   Thu, 3 May 2012 09:30:09 -0400


That information on the dependent variable is a helpful start.  A
natural next question is what the frequency distribution of the counts
looks like.  When explanatory variables are involved, one can't
necessarily judge by looking only at that frequency distribution
(the Poisson assumption applies conditionally on those variables),
but it is a reasonable place to start.

If 0 is substantially more frequent than would be compatible with a
Poisson distribution (judging by the nonzero counts), the data may
come from a two-part process.  A person may decide whether to seek
advice from any expert; a logistic regression model might be
appropriate for that part.  Then, among people who decide to seek
advice, the number of experts varies (according to a Poisson model or
a negative binomial model).  These are examples of a type of two-part
model known as a hurdle model.
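
As a rough illustration of "judging by the nonzero counts", here is a
minimal Python sketch (not Stata, and not a substitute for fitting a
model with covariates).  It estimates the Poisson mean from the
positive counts alone, using the zero-truncated Poisson mean
lam / (1 - exp(-lam)), and then compares the implied share of zeros,
exp(-lam), with the observed share.  The function names and the
example counts are made up for illustration.

```python
import math

def lambda_from_positive_mean(mean_pos, tol=1e-10):
    """Solve mean_pos = lam / (1 - exp(-lam)) for lam by bisection.

    The right-hand side is the mean of a zero-truncated Poisson; it is
    increasing in lam, exceeds lam, and tends to 1 as lam -> 0.
    """
    if mean_pos <= 1.0:
        raise ValueError("mean of the positive counts must exceed 1")
    lo, hi = 1e-9, mean_pos  # the root lies strictly between these
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # -expm1(-mid) is 1 - exp(-mid), computed accurately for small mid
        if mid / -math.expm1(-mid) < mean_pos:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def zero_share_check(counts):
    """Return (observed share of zeros, Poisson-implied share, lam)."""
    positives = [c for c in counts if c > 0]
    observed = 1 - len(positives) / len(counts)
    lam = lambda_from_positive_mean(sum(positives) / len(positives))
    return observed, math.exp(-lam), lam

# Hypothetical data: 60 people consulted no expert, 40 consulted 1 to 4.
counts = [0] * 60 + [1] * 20 + [2] * 12 + [3] * 6 + [4] * 2
observed, implied, lam = zero_share_check(counts)
print(f"observed zeros: {observed:.3f}, Poisson-implied: {implied:.3f}")
```

If the observed share of zeros is well above the implied share, that
points toward a hurdle or zero-inflated model, subject to the caveat
about explanatory variables.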

Another type of model is the zero-inflated Poisson or zero-inflated
negative binomial.  These are mixture models in which a count of 0 can
come either from deciding not to seek expert advice or from deciding
to seek expert advice but not actually consulting any experts (yet?).
In the corresponding hurdle models, the Poisson or negative binomial
distribution would be truncated at 0 (i.e., a count of 0 would not
come from the Poisson or the negative binomial).
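
To make the difference between the two model types concrete, here is
a minimal Python sketch of the two probability functions in the
Poisson case, without covariates.  The parameter names (pi for the
probability of an "extra" zero in the zero-inflated model, p_zero for
the probability of not crossing the hurdle) are labels chosen for
this illustration.

```python
import math

def poisson_pmf(k, lam):
    """Ordinary Poisson probability of count k."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: a mixture, so zeros have two sources."""
    base = (1 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

def hurdle_pmf(k, lam, p_zero):
    """Hurdle Poisson: zeros come only from the first (binary) part;
    positive counts follow a Poisson truncated at 0."""
    if k == 0:
        return p_zero
    return (1 - p_zero) * poisson_pmf(k, lam) / (1 - math.exp(-lam))

# Hypothetical parameter values, for illustration only.
lam, pi, p_zero = 1.5, 0.3, 0.5
for k in range(4):
    print(k, round(zip_pmf(k, lam, pi), 4), round(hurdle_pmf(k, lam, p_zero), 4))
```

In the zero-inflated version, P(0) = pi + (1 - pi) * exp(-lam), so
the Poisson part still contributes zeros; in the hurdle version,
P(0) is simply p_zero.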

I hope this helps.

David Hoaglin

> It's the number of experts a person has sought advice from, so I think
> there is no upper limit, like number of children.