
# Re: st: Model for Poisson-shaped distribution but with non-count data

From: Owen Gallupe
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Model for Poisson-shaped distribution but with non-count data
Date: Tue, 6 Dec 2011 12:48:35 -0800

```
Thank you for the input, Cam, Paul, Paul, Nick, David, and Bill.

You have given me some very good options to consider.

Regarding Cam's earlier question, the multimodality only surfaces when
the DV is transformed using lnskew0. It is not an issue using the raw
version.

All the best.

Owen

On Tue, Dec 6, 2011 at 9:34 AM, William Gould, StataCorp LP
<wgould@stata.com> wrote:
> David Hoaglin <dchoaglin@gmail.com>, in reference to the blog entry
> "Use Poisson Rather Than Regress, Tell a Friend" at
>
>    http://blog.stata.com/2011/08/22/use-poisson-rather-than-regress-tell-a-friend/
>
>  wrote,
>
>> [..] I did not see any
>> mention of the fact that the Poisson distribution is discrete.  In the
>> limit (as the mean of the distribution becomes large), that matters
>> less, but one would need to view the possible data values as discrete.
>>
>> Some of the equations in the blog are not quite correct.  For example,
>> since Poisson regression is a form of generalized linear model, the
>> linear predictor is fitted to log(E(y)), rather than to log(y).  The
>> random component of the GLM is a Poisson distribution.
>
> I'm concerned that someone might interpret what David wrote to mean
>
>    1.  There may be practical problems using -poisson- to run
>        log-linear regressions, depending on whether the LHS variable
>        contains noninteger values.
>
>    2.  There may be theoretical problems using -poisson- to run
>        log-linear regressions.
>
> Neither would be true.  My short-and-quick response is,
>
>    1.  -poisson- can handle non-discrete (non-integer) data values.
>        Left-hand-side values do not have to be large to ameliorate any
>        problem.
>
>    2.  The formulas in the blog are as intended and are correct.
>
> Let me explain.
>
>
> Concerning #1, -poisson- does not round values when run on noninteger
> data.  Instead, it gives the warning message "you are responsible for
> interpretation of noncount dep. variable."
>
> An implication of that is that the objective function with non-integer
> data may not be a true likelihood function.  Actually, I suspect that
> it is, but that's irrelevant because in the blog entry we are doing M
> estimation, and I recommended that you obtain standard errors using the
> -vce(robust)- option.
>
> When -poisson- calculates the likelihood value associated with a
> noninteger value, it does that using the standard formulas, but
> substituting the Gamma function for the factorial function (that is,
> Gamma(y+1) in place of y!).  That is appropriate for M estimation.
>
> This generalization means that you can run -poisson- using a LHS
> variable with noninteger values and there will be no problems.  All
> the values, in fact, can even be less than 1!  Whether you run on y,
> y/10, y/100, y/1000, ..., all that will change will be the intercept.
>
>
> Concerning #2, the formulas written in the blog entry imply that
> log(E(y)) = a + b*X.  It is true that I did write
>
>      y = exp(a + b*X + e)
>
> and that implies more than merely log(E(y)) = a + b*X.  I did that
> because I was starting with the log-linear regression problem.
> The purpose of the blog entry is to show that -poisson- could be
> used as an alternative to linear regression on ln(y).
>
> -- Bill
> wgould@stata.com
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

```
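Two of Bill's points can be checked numerically: the Poisson objective stays well defined for noninteger y once Gamma(y+1) replaces y!, and rescaling y moves only the intercept. The following is a minimal sketch in plain Python, not Stata's implementation of -poisson-. The data are made up, and the names `poisson_loglik` and `fit_poisson` are hypothetical helpers introduced only for this illustration.

```python
import math

def poisson_loglik(y, x, a, b):
    """Poisson objective for log(E(y)) = a + b*x, using lgamma(y + 1)
    in place of log(y!) so that noninteger y is allowed."""
    total = 0.0
    for yi, xi in zip(y, x):
        eta = a + b * xi
        total += yi * eta - math.exp(eta) - math.lgamma(yi + 1.0)
    return total

def fit_poisson(y, x, iters=50):
    """Maximize the objective above by Newton-Raphson (one covariate)."""
    a = math.log(sum(y) / len(y))  # start at the log of the mean of y
    b = 0.0
    for _ in range(iters):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for yi, xi in zip(y, x):
            mu = math.exp(a + b * xi)
            g_a += yi - mu            # score with respect to a
            g_b += (yi - mu) * xi     # score with respect to b
            h_aa += mu                # entries of the negative Hessian
            h_ab += mu * xi
            h_bb += mu * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        da = (h_bb * g_a - h_ab * g_b) / det
        db = (h_aa * g_b - h_ab * g_a) / det
        a += da
        b += db
        if abs(da) + abs(db) < 1e-12:
            break
    return a, b

# Made-up noninteger data: y = exp(0.3 + 0.5*x) times an arbitrary factor.
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
factors = [1.1, 0.9, 1.05, 0.95, 1.2, 0.8, 1.0, 1.02]
y = [math.exp(0.3 + 0.5 * xi) * f for xi, f in zip(x, factors)]
y_small = [yi / 10.0 for yi in y]  # every value is now below 1

a1, b1 = fit_poisson(y, x)
a2, b2 = fit_poisson(y_small, x)

print("slope on y:           ", b1)
print("slope on y/10:        ", b2)
print("intercept difference: ", a1 - a2, " log(10) =", math.log(10.0))
print("objective at the fit: ", poisson_loglik(y, x, a1, b1))
```

The slope agrees across the two fits because the score equations involve y only through y - exp(a + b*x); dividing y by 10 is absorbed by lowering a by log(10). The lgamma term does not involve a or b, so it affects the reported objective value but not the estimates.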