



st: Zero Inflated Poisson Regression


From   "Scott Holupka" <Scott.Holupka@jhu.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Zero Inflated Poisson Regression
Date   Mon, 6 Aug 2012 13:48:32 -0400

This is mainly a question about running a zero-inflated Poisson regression
using zip (Stata 10.1), but it's also a more general question of whether
Statalisters think I'm using the procedures in an appropriate way.

My analysis is examining several expenditure categories.  Typical of
expenditure data, the outcome variables are all skewed.  Also typical is
that several outcomes have a large percentage (20% to 40%) of cases
reporting zero.  I am therefore considering using zero-inflated Poisson
(ZIP) models, fitted with zip, to examine these outcomes.

Prior research also suggests that the relationship between our primary
independent variable - call it H - and expenditures will not be linear.  In
particular, we expect spending may be lower at both high and low values of
H.  I have previously used polynomial models to examine this relationship,
but I'm not sure if polynomials can be used with negative Poisson models.
I am therefore also considering using a piecewise regression approach with
ZIP.

Finally, I'm concerned about omitted variable bias since I don't have a
randomized sample.  Again, in previous work I've used propensity score
methods to account for differences in observed characteristics.   

I know how to implement each of these methods in Stata, but I'm wondering if
it's appropriate to use all three methods at once.  My current plan is to
run propensity analyses to identify similar groups based on observed
characteristics, then use those groups as covariates in a zero-inflated
Poisson model that also includes either polynomial terms of H (e.g. H and
H-squared) or piecewise dummy variables of H.
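
To make the plan concrete, here is a rough sketch in Stata 10.1 syntax.
All variable names are hypothetical (spend for the outcome, treat for a
binary indicator used only to build the propensity score, x1 and x2 for
observed characteristics), and the quintile grouping and the cutpoints
for H are arbitrary placeholders, not choices I have settled on.

```stata
* Hypothetical names throughout: spend, H, treat, x1, x2.

* 1. Propensity score and strata (quintiles are an arbitrary choice)
logit treat x1 x2
predict ps, pr
xtile psgrp = ps, nquantiles(5)
xi i.psgrp                      // creates _Ipsgrp_2 ... _Ipsgrp_5

* 2a. ZIP with a quadratic in H plus the propensity-group dummies
generate H2 = H^2
zip spend H H2 _Ipsgrp_* x1 x2, inflate(H H2 _Ipsgrp_*) vuong

* 2b. Alternative: piecewise dummies for H (cutpoints 10 and 20 are made up)
generate byte H_mid = (H >= 10 & H < 20) if !missing(H)
generate byte H_hi  = (H >= 20) if !missing(H)
zip spend H_mid H_hi _Ipsgrp_* x1 x2, inflate(H_mid H_hi _Ipsgrp_*) vuong
```

The same terms are entered in both the count equation and the inflate()
equation here only for illustration; whether they belong in both is part
of what I am unsure about.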

Any thoughts on whether this approach seems appropriate, particularly
whether ZIP can handle both the propensity covariates and polynomial terms,
would be appreciated.

Thanks,

Scott


