Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Maarten buis <maartenbuis@yahoo.co.uk>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Poisson Regression
Date: Mon, 14 Feb 2011 09:20:23 +0000 (GMT)

--- On Sun, 13/2/11, Alexandra Boing wrote:
> I would like to know how to proceed and the justification
> Mathematical and Statistical. My dependent variable is
> spent on health (0=No 1=Yes). The prevalence was higher
> than 10 percent. Can I do Poisson regression? According
> to this paper published in BMC on line in 2003, registered
> PMC521200 I can do Poisson regression with variable (0=No
> 1=Yes) and with prevalence higher than 10 percent, but
> other authors report that only I can do Poisson regression
> with the dependent variable= discrete variable and
> prevalence under 10 percent. Which is correct? And what is
> the explanation Mathematical and Statistical?

I agree with Carlo that you need to give a more complete reference to the article you just referred to.

The -poisson- model for binary variables is used when one wants to interpret the coefficients as risk ratios. The problem is that when the prevalence is high, the predicted risks can easily become higher than 1. Even if the predicted risks remain less than 1 but are still high, the relationship between a continuous explanatory variable and your outcome variable can have a shape that is just too unrealistic.

The 10 percent strikes me as a reasonable "rule of thumb", but there is no such thing as a "correct rule of thumb": they are always approximate.

I would use -adjust- to get adjusted predictions, set the other covariates at values that make the predicted probability as high as possible, and plot the resulting curves. If the curves still look reasonable, then there is probably no problem. It may also help to plot the curve from a -logit- regression, which would be the obvious alternative when -poisson- leads to unrealistic predictions.
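To see why the predicted risks can exceed 1, it may help to compare the two inverse link functions directly. The sketch below is only an illustration: the coefficients `b0` and `b1` and the covariate values are made up, not estimated from any data. A -poisson- model predicts exp(xb), which has no upper bound, while a -logit- model predicts invlogit(xb), which always stays below 1.

```stata
* Hypothetical coefficients, for illustration only (not estimated from data)
local b0 = -1.2     // constant
local b1 = 0.15     // slope of a continuous covariate x

* Compare the poisson-type and logit-type predictions over a range of x
forvalues x = 0(4)16 {
    local xb = `b0' + `b1'*`x'
    display "x = " %2.0f `x'                       ///
        "   exp(xb) = "      %5.3f exp(`xb')       ///
        "   invlogit(xb) = " %5.3f invlogit(`xb')
}
```

With these made-up values the linear predictor is 0 at x = 8, so exp(xb) rises above 1 for larger x, whereas invlogit(xb) remains a valid probability throughout.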
*--------------------- begin example -------------------------
sysuse nlsw88, clear

* indicator for the first two occupation categories
gen byte highocc = occupation < 3 if !missing(occupation)
* indicator for black versus white; other races are set to missing
gen byte black = race == 2 if race <=2

* predicted risk from the poisson model, by grade
poisson union south grade highocc black
adjust south=1 highocc=0 black=1, by(grade) exp gen(pr_poiss)

* predicted probability from the logit model, by grade
logit union south grade highocc black
adjust south=1 highocc=0 black=1, by(grade) pr gen(pr_logit)

* plot both curves against grade
twoway line pr* grade, sort          ///
    ytitle("predicted probability")  ///
    legend(order( 1 "poisson"        ///
                  2 "logit" ))
*---------------------- end example ---------------------------
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

**References**: **st: Poisson Regression**, *From:* Alexandra Boing <alexandraboing@yahoo.com.br>
