

From: "William Gould, StataCorp LP" <wgould@stata.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Model for Poisson-shaped distribution but with non-count data
Date: Tue, 06 Dec 2011 11:34:13 -0600

David Hoaglin <dchoaglin@gmail.com>, in reference to the blog entry "Use Poisson Rather Than Regress, Tell a Friend" at
http://blog.stata.com/2011/08/22/use-poisson-rather-than-regress-tell-a-friend/
wrote,

> [..] I did not see any
> mention of the fact that the Poisson distribution is discrete. In the
> limit (as the mean of the distribution becomes large), that matters
> less, but one would need to view the possible data values as discrete.
>
> Some of the equations in the blog are not quite correct. For example,
> since Poisson regression is a form of generalized linear model, the
> linear predictor is fitted to log(E(y)), rather than to log(y). The
> random component of the GLM is a Poisson distribution.

I'm concerned that someone might interpret what David wrote to mean

1. There may be practical problems using -poisson- to run log-linear regressions, depending on whether the LHS variable contains noninteger values.

2. There may be theoretical problems using -poisson- to run log-linear regressions.

Neither would be true. My short-and-quick response is,

1. -poisson- can handle non-discrete (non-integer) data values. Left-hand-side values do not have to be large to ameliorate any problem.

2. The formulas in the blog are as intended and are correct.

Let me explain.

Concerning #1, -poisson- does not round values when run on noninteger data. Instead, it gives the warning message "you are responsible for interpretation of noncount dep. variable." An implication of that is that the objective function with non-integer data may not be a true likelihood function. Actually, I suspect that it is, but that's irrelevant because in the blog entry we are doing M estimation and I recommended you obtain standard errors using the -vce(robust)- option.

When -poisson- calculates the likelihood value associated with a noninteger value, it does that using the standard formulas, but substituting the Gamma function for the factorial function. That is appropriate for M estimation.
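To make the Gamma-for-factorial substitution concrete, here is a minimal Python sketch of one observation's contribution to the Poisson log-likelihood. It is an illustration only, not Stata's implementation: the function name `poisson_loglik` and the numbers are made up for the example. It relies on the identity log(y!) = lgamma(y + 1) for integer y, which remains defined when y is not an integer.

```python
import math

def poisson_loglik(y, mu):
    """Log-likelihood contribution -mu + y*log(mu) - log(Gamma(y + 1)).

    For integer y, Gamma(y + 1) equals y!, so this matches the usual
    Poisson formula; for noninteger y it is the generalization described
    in the text. (Hypothetical helper, not Stata's code.)
    """
    return -mu + y * math.log(mu) - math.lgamma(y + 1.0)

# Integer y: agrees with the textbook formula written with a factorial.
factorial_version = -2.5 + 3 * math.log(2.5) - math.log(math.factorial(3))
print(abs(poisson_loglik(3, 2.5) - factorial_version) < 1e-12)

# Noninteger y, even below 1: the value is still well defined and finite.
print(math.isfinite(poisson_loglik(0.37, 2.5)))
```

Both checks print `True`: the formula reduces to the usual one on counts and stays finite off the integers.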
This generalization means that you can run -poisson- using a LHS variable with noninteger values and there will be no problems. All the values, in fact, can even be less than 1! Whether you run on y, y/10, y/100, y/1000, ..., all that will change will be the intercept.

Concerning #2, the formulas written in the blog entry imply that log(E(y)) = a + b*X. It is true that I did write y = exp(a + b*X + e) and that implies more than merely log(E(y)) = a + b*X. I did that because I was starting with the log-linear regression problem. The purpose of the blog entry is to show that -poisson- could be used as an alternative to linear regression on the ln(y).

-- Bill
wgould@stata.com
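The claim that rescaling y changes only the intercept can be checked numerically. The sketch below is a hedged illustration in Python, not a reproduction of -poisson-: `fit_poisson` is a hypothetical helper that solves the Poisson score equations for log(E(y)) = a + b*x by Newton-Raphson, and the data values are invented. It produces point estimates only and does not compute robust standard errors.

```python
import math

def fit_poisson(x, y, iters=50):
    """Solve the Poisson score equations for log(E(y)) = a + b*x.

    Plain Newton-Raphson with one covariate; y may be noninteger.
    (Hypothetical helper for illustration only.)
    """
    a, b = math.log(sum(y) / len(y)), 0.0  # start at the intercept-only fit
    for _ in range(iters):
        mu = [math.exp(a + b * xi) for xi in x]
        # Score (gradient) with respect to a and b.
        ga = sum(yi - mi for yi, mi in zip(y, mu))
        gb = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Negative Hessian, a symmetric 2x2 matrix.
        haa = sum(mu)
        hab = sum(mi * xi for xi, mi in zip(x, mu))
        hbb = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = haa * hbb - hab * hab
        # Newton step: (a, b) += inverse(negative Hessian) * score.
        a += (hbb * ga - hab * gb) / det
        b += (haa * gb - hab * ga) / det
    return a, b

# Made-up noninteger data, roughly exponential in x.
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
y = [1.2, 1.9, 2.4, 4.1, 5.3, 8.8, 11.6]

a1, b1 = fit_poisson(x, y)
a2, b2 = fit_poisson(x, [yi / 10 for yi in y])  # every value now near or below 1

print(abs(b1 - b2) < 1e-9)                    # slope is unchanged
print(abs((a1 - a2) - math.log(10)) < 1e-9)   # intercept shifts by log(10)
```

Dividing y by 10 leaves the slope alone and lowers the intercept by log(10), which is what the score equations predict: replacing a with a - log(10) rescales every fitted mean by the same factor as y.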

