# RE: st: Zero-inflated binomial regression

 From "Steichen, Thomas J." <[email protected]> To <[email protected]> Subject RE: st: Zero-inflated binomial regression Date Tue, 23 Oct 2007 11:19:17 -0400

```There is also the zero-inflated negative binomial model (ZINB),
which allows overdispersion (a variance that exceeds the mean)
in the number of academic after-school programs.

Maarten's commentary below would apply to this model also
except that the "number of programs" would not need to be
determined through a strict Poisson (variance = mean) process.

Tom

-----------------------------------
Thomas J. Steichen
[email protected]
-----------------------------------

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Maarten buis
Sent: Tuesday, October 23, 2007 9:39 AM
To: [email protected]
Subject: Re: st: Zero-inflated binomial regression

--- Meryle Weinstein <[email protected]> wrote:
> I have count data and have been doing analyses using negative
> binomial regression. I've been doing reading and think that the
> zero-inflated  binomial regression may be more appropriate given the
> number of zeros in data (243 out of 626).

1) I assume you mean zero-inflated Poisson (-zip- in Stata) instead of
zero-inflated binomial.

2) The negative binomial can also accommodate excess zeros, although
it assumes they arise through a different process (overdispersion in
a single count process rather than a separate always-zero group).

> The data is the count of academic after-school programs in an
> elementary school zone.  The zones could have zero because
> they don't have any after-school programs (which is the majority of
> cases) or zero because there are no academic programs.   What
> I don't understand and haven't been able to find in the readings is
> how to choose the variables for inflate.

With -zip- you assume that there are two types of districts: a type of
district that will always have 0 programs, and a type of district
in which the number of programs is determined through a Poisson
regression (which may include 0 programs). You have not observed the
type, only the count, which is a mixture of the two processes.
The -inflate(varlist)- option tells -zip- which variables predict the
type of district. So you choose those variables you think will
influence the probability of being an "always zero program district".

For more on this I highly recommend "Regression Models for Categorical
Dependent Variables Using Stata" by J. Scott Long and Jeremy Freese.

http://www.stata.com/bookstore/regmodcdvs.html

Hope this helps,
Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


```
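The two-type mixture that Maarten describes for -zip- can be written out numerically. What follows is a minimal Python sketch under stated assumptions, not Stata's implementation: the function names `zip_pmf` and `always_zero_prob` and all numeric values are hypothetical, and a logit link is assumed for the probability of being an "always zero" district (the role played by the variables in -inflate(varlist)-).

```python
import math


def always_zero_prob(linear_predictor):
    """Probability of being an 'always zero' district.

    Assumes a logit link applied to a linear combination of the
    inflate variables (hypothetical helper for illustration).
    """
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def zip_pmf(k, mu, pi_zero):
    """P(Y = k) under a zero-inflated Poisson mixture.

    mu      -- Poisson mean for districts in the count process
    pi_zero -- probability that a district is of the always-zero type
    """
    poisson = math.exp(-mu) * mu ** k / math.factorial(k)
    if k == 0:
        # A zero can come from either type of district.
        return pi_zero + (1.0 - pi_zero) * poisson
    # A positive count can only come from the Poisson-type districts.
    return (1.0 - pi_zero) * poisson


# Illustrative values only; they are not estimates from the thread's data.
pi_zero = always_zero_prob(-0.8)
share_of_zeros = zip_pmf(0, mu=2.0, pi_zero=pi_zero)
```

With `pi_zero` set to 0 the function reduces to the ordinary Poisson probability, which is why the observed share of zeros alone cannot tell you which type a given district belongs to.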