Re: st: Binary model with many zeros and few ones

From   Nick Cox
Subject   Re: st: Binary model with many zeros and few ones
Date   Fri, 6 Jan 2012 11:33:36 +0000

Zero inflation, as I understand it, applies to situations in which there
is some kind of mixture: individuals who are always zero for one reason
and individuals who may be zero or one for another reason. For example,
many people never go to football matches, while others do go but just
didn't during some survey period. I don't think your description here
justifies that term. Some people might describe your situation as one of
rare events, and you might want to Google "Gary King rare events logit".
That said, I would certainly try -logit- or -probit- first.
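
A minimal sketch of that first step, assuming hypothetical variable names that are not in the original post: event for the 0/1 outcome and x1, x2 for the covariates. With about 880,000 observations and roughly 1.5% ones, there are on the order of 13,000 events, so the ones are rare as a share of the data but not few in number.

```stata
* Sketch only: event, x1 and x2 are hypothetical variable names
* (event is the 0/1 outcome; x1 and x2 stand in for the covariates).

* Check how rare the ones really are
tabulate event

* Logit fit, then average marginal effects on the probability scale
logit event x1 x2
estimates store m_logit
margins, dydx(*)

* Probit fit, same specification
probit event x1 x2
estimates store m_probit
margins, dydx(*)

* Coefficients and standard errors side by side
estimates table m_logit m_probit, b se
```

Logit and probit coefficients are on different scales, so the average marginal effects from -margins- are the more useful comparison between the two fits.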


On Fri, Jan 6, 2012 at 11:15 AM, Nikolaos Kanellopoulos wrote:

> I have a dataset of around 880 thousand observations and I want to measure as accurately as possible the relationship between certain variables and an event described by a binary variable. My dependent variable has very few ones (around 1.5% of the observations).
> My question, and I apologize in advance if this has been asked on Statalist before, is: what is the best way to analyse this “zero inflated” binary variable? Is it OK to use a simple probit or logit model? Any suggestions/references are more than welcome.
