st: Autocorrelated overdispersed panel count data, and no "xtzinb"...

From   Michael Mulcahy
Subject   st: Autocorrelated overdispersed panel count data, and no "xtzinb"...
Date   Wed, 30 Nov 2011 10:20:23 -0800 (PST)

Hi all,

Another user posted a similar question a while back, but I couldn't really extract much guidance from that exchange, so...

I have a count dependent variable and IVs measured annually on about 600 cities
over 12 consecutive years (balanced panel). The dependent variable is
over-dispersed and zero-inflated (about 40% of observations on the dependent
variable are zeros). Based on xtserial, there is evidence of autocorrelation. I
know that Stata doesn't offer anything like an "xtzinb" model (yet?).

My focal independent variable is a categorical variable indicating single years in the 5-year period that brackets the city government's decision on a piece of citizen-initiated legislation (0 = no decision, 1 = L2.decis, 2 = L.decis, 3 = decis, 4 = F.decis, 5 = F2.decis). The question is whether counts of the dependent variable are significantly different in the run-up to, in the year of, and/or in the wake of a decision.

I used a zinb model with L1.dv and L2.dv in both the "inflate" and the nbreg equations, with standard errors clustered on city and exposure(year).
L1.dv and L2.dv are significant at p < .0001 in the "inflate" equation;
only L1.dv is significant in the nbreg equation.
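
In Stata terms, the specification is roughly the sketch below. The variable names (dv, decis_cat, x1, x2, city, year) are placeholders rather than my actual variable names, the covariate lists are abbreviated, and the lags are generated explicitly on the assumption that this is the safest way to get them into both equations.

```stata
* Sketch only: dv, decis_cat, x1, x2, city and year are placeholder names
* Declare the panel structure so the lag operators respect city boundaries
xtset city year

* Build the lagged dependent variable explicitly for use in both equations
generate dv_l1 = L.dv
generate dv_l2 = L2.dv

* Zero-inflated negative binomial: count equation first, then inflate();
* the other inflate() covariates are omitted from this sketch
zinb dv i.decis_cat dv_l1 dv_l2 x1 x2, ///
    inflate(dv_l1 dv_l2) ///
    exposure(year) vce(cluster city)
```

Note that the first two years of each city's series drop out of the estimation sample because the lags are missing there.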

One reviewer questions the use of a lagged dependent variable as potentially biasing coefficients downwards, citing Keele and Kelly (2006), "Dynamic Models for Dynamic Theories: The Ins and Outs of Lagged Dependent Variables," Political Analysis 14(2). Their argument doesn't explicitly address count models.
For comparison, I also tried a hurdle model: the results for the central IV of interest are not substantially different. 

Is the zinb approach described above defensible? 
What other model comparisons should I do to help evaluate / shore up this approach? 
If the criticism is valid, what's the best alternative that addresses autocorrelation, zero-inflation and overdispersion? 

Any help is greatly appreciated!