st: Which regression model to use for zero-inflated, non-normal outcome?

Fri, 2 Oct 2009 18:30:39 -0700 (PDT)

Hi,

I'm trying to run a regression model to identify independent predictors of a specific continuous outcome (the dependent variable).

(1) The outcome is non-normal (swilk p-value 0.0000), so I can't use a linear regression model.

(2) The outcome value is zero for a number of patients (approximately 30% of the cohort), so I can't directly use a log-linear model: log(outcome) is undefined for patients whose outcome is zero, and they are automatically dropped from the analysis. One option would be to assign a nominal value to those with zero, i.e. add 0.5 to every patient's outcome so that none is zero.

(3) Even if the outcome is treated as a count variable (incidence), the variance is much greater than the mean, and the Poisson goodness-of-fit test has a p of 0.000.

(4) A negative binomial model fits better, but does the high number of zeros raise any concern?

(5) I also tried zero-inflated negative binomial regression, but all the examples I've seen are ones where one of the independent variables has a high number of zeros. Is it appropriate to use the zinb command when the dependent variable has a high number of zeros?

Thanks,
Ashwin
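
For concreteness, the steps in (1) to (5) correspond roughly to the Stata commands below. This is only a sketch: the outcome name y and the predictor names x1 and x2 are hypothetical placeholders, not variables from the actual dataset, and the choice of predictors in inflate() is an assumption made for illustration. The // comments assume the lines are run from a do-file.

```stata
* Sketch only: y, x1 and x2 are hypothetical placeholder names
swilk y                    // (1) Shapiro-Wilk test of normality for the outcome
count if y == 0            // (2) number of observations with a zero outcome
summarize y, detail        // (3) compare the variance with the mean

poisson y x1 x2            // (3) Poisson regression
estat gof                  //     goodness-of-fit test after poisson

nbreg y x1 x2              // (4) negative binomial regression

* (5) Zero-inflated negative binomial: inflate() names the predictors
*     of the excess zeros in the outcome y
zinb y x1 x2, inflate(x1 x2)
```

In this syntax the inflate() equation models the probability of an excess zero in the dependent variable y, so the zeros in question are those of the outcome.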

