

From: "Ariel Linden, DrPH" <ariel.linden@gmail.com>
To: <statalist@hsphsun2.harvard.edu>
Subject: re: RE: st: Zero Inflated Poisson Regression
Date: Wed, 8 Aug 2012 12:42:52 -0400

Hi Scott,

I have been mulling over your posting, trying to think of a similar scenario in my discipline (health services research). In fact, this is similar to certain medical conditions that are age related. For example, asthma impacts young children and then goes "dormant" for a few years and then reappears in the early 20's to 30's (after that it goes dormant again and reappears as chronic obstructive pulmonary disease later on in the late 40's to early 50's). So imagine that we'd want to look at hospitalizations for asthma as the outcome.

This is likely a Poisson-like distribution, and we'd need to account for the age issue described above. Exact matching on age (or age category) would seem to be a reasonable approach here, since we'd expect children undergoing an intervention to have fewer hospitalizations than children not undergoing the intervention (or perhaps lower probability of hospitalization). Those individuals in the middle age range where asthma is "dormant" will not likely show any difference over controls in hospitalizations, but that may be a function of sample size/power. You could also consider stratification here, which may be a better approach with a bimodal or multi-modal distribution.

You also correctly noted that you could use either splines or fractional polynomials. In the case of age, we'd expect this "transformation" to account for the distributional "hump" for children and then again in the twenties. I imagine that splines may fit better than fp.

As for omitted variables, you'd have to answer that question based on your knowledge of the data and content expertise. Is there a reason to believe you're missing important variables? Would you assume the results are biased? I would suggest that you run a sensitivity analysis after your analysis to determine the likelihood of unknown confounding biasing your results.
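To make the spline suggestion concrete, here is a minimal sketch of how a linear (piecewise) spline basis for age could be built before entering it in a count model. It is written in plain Python for illustration only; the function name `linear_spline_basis` and the knot ages are hypothetical, not taken from the thread. The construction is intended to resemble the kind of variables Stata's `mkspline` creates, but treat that correspondence as an assumption and check it against the Stata documentation.

```python
def linear_spline_basis(x, knots):
    """Split x into piecewise-linear segments at the given knots.

    Returns one term per segment. The terms always sum to x, so in a
    regression the coefficient on each term is the slope within that
    segment of x.
    """
    knots = sorted(knots)
    if not knots:
        return [x]
    # First segment: everything up to the first knot.
    basis = [min(x, knots[0])]
    # Interior segments: the part of x that falls between adjacent knots.
    for lower, upper in zip(knots, knots[1:]):
        basis.append(max(min(x, upper), lower) - lower)
    # Last segment: everything beyond the final knot.
    basis.append(max(x, knots[-1]) - knots[-1])
    return basis


# Hypothetical knot ages, chosen only to illustrate the idea.
AGE_KNOTS = [12, 20, 35]

for age in (8, 25, 50):
    print(age, linear_spline_basis(age, AGE_KNOTS))
```

Each age is decomposed into four terms, one per segment, so the fitted slope is free to change at every knot. Where the knots belong is a substantive decision that depends on the data.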
I hope this helps

Ariel

Date: Tue, 7 Aug 2012 13:04:18 -0400
From: "Scott Holupka" <Scott.Holupka@jhu.edu>
Subject: RE: st: Zero Inflated Poisson Regression

Thanks for the suggestions. We've tried in the past to find an appropriate IV, but so far haven't found anything that works. At least with propensity we can try to control for any observed differences.

Scott

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Cameron McIntosh
Sent: Monday, August 06, 2012 3:10 PM
To: STATA LIST
Subject: RE: st: Zero Inflated Poisson Regression

Scott,

Good question. Generally, I don't know if there is very much out there on how to fit ZIPs, or count/rate variable regression models in general, with non-linear relations (e.g., quadratic as you seem to suggest). I don't know what Stata has to offer in this regard (as I'm not a "Stata guy"), but I might suggest a neural network approach, perhaps using MATLAB:

Nader, F., Hong, G., Kazem, M., Ali, S.S., Keramat, N., & Reza, E.M. (2009). Nonlinear Poisson regression using neural networks: a simulation study. Neural Computing & Applications, 18(8), 939-943.
http://www.mscs.dal.ca/~hgu/Neural%20Comput%20&%20Applic.pdf

As for your endogeneity problem, MATLAB also does propensity score matching, and you may also want to consider using instrumental variables, if you can find some good ones in your data set.

Caliendo, M., & Kopeinig, S. (2008). Some practical guidance for the implementation of propensity score matching. Journal of Economic Surveys, 22, 31-72.
Stuart, E. A. (2010). Matching methods for causal inference: A review and a look forward. Statistical Science, 25, 1-21.
Bollen, K.A. (2012). Instrumental Variables in Sociology and the Social Sciences. Annual Review of Sociology, 38, 37-72.
http://www.annualreviews.org/doi/abs/10.1146/annurev-soc-081309-150141?journalCode=soc

Perhaps some more experienced Stata programmers could provide you with a Stata solution, however. Anyway, hope this helps.

Cam

----------------------------------------
> From: Scott.Holupka@jhu.edu
> To: statalist@hsphsun2.harvard.edu
> Subject: st: Zero Inflated Poisson Regression
> Date: Mon, Aug 012 3::8::2 -400<
>
> This is mainly a question about running a zero-inflated poisson regression using zip (Stata 0..)), but it's also a more general question of whether Statalisters think I'm using the procedures in an appropriate way.
>
> My analysis is examining several expenditure categories. Typical of expenditure data, the outcome variables are all skewed. Also typical is that several outcomes have a large percentage (0%% to 0%%) of cases reporting zero. I am therefore considering using zero-inflated poisson models - zip - to examine these outcomes.
>
> Prior research also suggests that the relationship between our primary independent variable - call it H - and expenditures will not be linear. In particular, we expect spending may be lower at both high and low values of H. I have previously used polynomial models to examine this relationship, but I'm not sure if polynomials can be used with negative poisson models. I am therefore also considering using a piecewise regression approach with ZIP.
>
> Finally, I'm concerned about omitted variable bias since I don't have a randomized sample. Again, in previous work I've used propensity score methods to account for differences in observed characteristics.
>
> I know how to implement each of these methods in Stata, but I'm wondering if it's appropriate to use all three methods at once.
> My current plan is to run propensity analyses to identify similar groups based on observed characteristics, then use those groups as covariates in a zero-inflated poisson model that also includes polynomial terms of H (e.g. H and H-squared), or piecewise dummy variables of H.
>
> Any thoughts on whether this approach seems appropriate, particularly whether ZIP can handle both the propensity covariates and polynomial terms, would be appreciated.
>
> Thanks,
>
> Scott

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
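For reference, the model described in the original post can be written out as a standard zero-inflated Poisson with a quadratic in H. This is only a sketch of one possible specification. The symbols are assumptions introduced here, not notation from the thread: $\pi_i$ is the probability of a structural zero, $\mu_i$ the Poisson mean, $s_i$ a vector of propensity-stratum indicators, and $z_i$ the covariates in the inflation equation.

```latex
\begin{align}
P(Y_i = 0) &= \pi_i + (1 - \pi_i)\, e^{-\mu_i}, \\
P(Y_i = y) &= (1 - \pi_i)\, \frac{e^{-\mu_i} \mu_i^{y}}{y!}, \qquad y = 1, 2, \dots \\
\log \mu_i &= \beta_0 + \beta_1 H_i + \beta_2 H_i^{2} + s_i^{\top} \gamma, \\
\operatorname{logit} \pi_i &= z_i^{\top} \alpha .
\end{align}
```

Under this specification, the expectation that spending is lower at both high and low values of H corresponds to $\beta_2 < 0$, with the expected count peaking at $H = -\beta_1 / (2\beta_2)$. The polynomial terms and the stratum indicators enter the linear predictor like any other covariates, so nothing in the likelihood itself rules out using them together.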
