RE: st: Zero-inflated binomial regression

From   "Steichen, Thomas J."
Subject   RE: st: Zero-inflated binomial regression
Date   Tue, 23 Oct 2007 11:19:17 -0400

There is also the zero-inflated negative binomial model (ZINB), 
which allows overdispersion (a variance that exceeds the mean)
in the number of academic after-school programs.

Maarten's commentary below also applies to this model,
except that the "number of programs" would not need to be
determined through a strict Poisson (variance = mean) process.
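
In symbols, the contrast can be sketched as follows. This assumes the mean-dispersion parameterization that -nbreg- and -zinb- use by default; the notation (\mu for the conditional mean of the count process, \alpha \ge 0 for the overdispersion parameter) is added here for illustration and is not from the thread.

```latex
\text{Poisson:}\quad \operatorname{Var}(y \mid x) = \mu
\qquad\qquad
\text{Negative binomial:}\quad \operatorname{Var}(y \mid x) = \mu + \alpha\,\mu^{2}
```

As \alpha approaches 0 the negative binomial variance collapses to the Poisson case, so the Poisson count process is the special case with no overdispersion.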


Thomas J. Steichen
-----Original Message-----
From: Maarten Buis
Sent: Tuesday, October 23, 2007 9:39 AM
Subject: Re: st: Zero-inflated binomial regression

--- Meryle Weinstein wrote:
> I have count data and have been doing analyses using negative
> binomial regression. I've been doing reading and think that the
> zero-inflated binomial regression may be more appropriate given the
> number of zeros in data (243 out of 626).

Two comments: 
1) I assume you mean zero-inflated Poisson (-zip- in Stata) instead of
zero-inflated binomial.

2) The negative binomial is also meant to deal with excessive zeros,
although it assumes these came into existence through a different
process.

> The data is the count of academic after-school programs in an
> elementary school zone. The zones could have zero because
> they don't have any after-school programs (which is the majority of
> cases) or zero because there are no academic programs. What
> I don't understand and haven't been able to find in the readings is
> how to choose the variables for inflate.

With -zip- you assume that there are two types of districts: a type
that will always have 0 programs, and a type in which the number of
programs is determined through a Poisson regression (which may still
yield 0 programs). You don't observe the type, only the count, and the
observed counts are a mixture of the two processes.
The -inflate(varlist)- option tells -zip- which variables predict the
type of district. So you choose those variables you think will
influence the probability of being an "always zero programs" district.
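
The mixture described above can be sketched numerically. This is a minimal illustration in Python rather than Stata, assuming a logit link for the "always zero" probability (the default link for -inflate()- in -zip-, as far as I know). The function names and the two linear-predictor values are hypothetical and chosen only for the example.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability of observing count k when the mean is mu (mu > 0)."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def inv_logit(xb):
    """Map a linear predictor to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-xb))


def zip_pmf(k, mu, pi_zero):
    """Zero-inflated Poisson probability of observing count k.

    pi_zero: probability that a district is of the "always zero" type
    mu:      Poisson mean for districts of the other type
    """
    count_part = (1.0 - pi_zero) * poisson_pmf(k, mu)
    if k == 0:
        # A zero can come from either type of district.
        return pi_zero + count_part
    return count_part


# Hypothetical linear predictors, for illustration only.
pi_zero = inv_logit(-0.5)  # from the variables listed in inflate()
mu = math.exp(0.7)         # from the variables in the main equation

p0_poisson = poisson_pmf(0, mu)
p0_zip = zip_pmf(0, mu, pi_zero)
total = sum(zip_pmf(k, mu, pi_zero) for k in range(100))

print("P(0) under plain Poisson:", round(p0_poisson, 4))
print("P(0) under ZIP:          ", round(p0_zip, 4))
```

The variables in the main equation shift mu, while the variables in -inflate()- shift pi_zero. That is why the two lists can differ: one set explains how many programs a district has, the other explains whether it can have any at all.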

For more on this I highly recommend "Regression Models for Categorical
Dependent Variables Using Stata" by J. Scott Long and Jeremy Freese.

Hope this helps,

Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715
