This question was originally posed on Statalist.

Title: Stata 5: Goodness-of-fit chi-squared test reported by poisson

Author: Bill Sribney, StataCorp

The version 5 documentation indicates the goodness-of-fit chi-squared statistic reported with the results of Poisson regression is a test of the null hypothesis that the dependent variable is Poisson distributed. My question is why this statistic (and perhaps the resulting inference regarding the appropriateness of Poisson regression) varies with the composition of the right-hand-side variables.

The goodness-of-fit chi-squared statistic in the **poisson** command is a
simple Pearson's chi-squared statistic:

$$\chi^2 = \sum_{i=1}^{N} \frac{(\text{observed}_i - \text{expected}_i)^2}{\text{expected}_i}$$

where **i** indexes the observations in the dataset. The **df** is

df = N - (#terms in model including the constant)

If you split up or group the counts and exposures differently, you get different cells for the Pearson's chi-squared and thus a different statistic.

Here’s an illustration using the first example in the **poisson** entry, on page 31 of the P–Z Reference manual:

    . list

          airline   injuries        n   XYZowned
      1.        1         11   0.0950          1
      2.        2          7   0.1920          0
      3.        3          7   0.0750          0
      4.        4         19   0.2078          0
      5.        5          9   0.1382          0
      6.        6          4   0.0540          1
      7.        7          3   0.1292          0
      8.        8          1   0.0503          0
      9.        9          3   0.0629          1

    . poisson injuries XYZowned, exposure(n) irr

    Iteration 0:  Log Likelihood =  -23.90184
    Iteration 1:  Log Likelihood = -23.032242
    Iteration 2:  Log Likelihood = -23.027176

    Poisson regression, normalized by n               Number of obs    =         9
    Goodness-of-fit chi2(7)    =    14.094            Model chi2(1)    =     1.768
    Prob > chi2                =    0.0495            Prob > chi2      =    0.1836
    Log Likelihood             =   -23.027            Pseudo R2        =    0.0370

    ------------------------------------------------------------------------------
    injuries |        IRR   Std. Err.       z     P>|z|       [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
    XYZowned |   1.463467    .406872      1.370   0.171       .8486578    2.523675
    ------------------------------------------------------------------------------
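As a cross-check on the formula and the output above, here is a minimal Python sketch (not Stata, and not how **poisson** is implemented) that recomputes the fitted counts and the goodness-of-fit sums from the nine listed observations. It rests on one assumption that holds only because the model has a constant plus a single 0/1 covariate: the maximum-likelihood rate in each XYZowned group is that group's total injuries divided by its total exposure. The names `rows`, `group_rates`, and `gof` are made up for this illustration. On these data the Pearson sum comes out at about 13.39, while the deviance form of the statistic comes out at about 14.094, the figure printed in the output. That suggests the reported number may be the deviance rather than the Pearson sum; this is an inference from the arithmetic, not something the output states.

```python
import math

# Data copied from the listing above: (injuries, n, XYZowned) for each airline.
rows = [
    (11, 0.0950, 1), (7, 0.1920, 0), (7, 0.0750, 0),
    (19, 0.2078, 0), (9, 0.1382, 0), (4, 0.0540, 1),
    (3, 0.1292, 0), (1, 0.0503, 0), (3, 0.0629, 1),
]


def group_rates(rows):
    """Fitted injury rate per XYZowned group: total injuries / total exposure.

    Assumption: this shortcut equals the Poisson ML fit only because the
    model is a constant plus one binary covariate.
    """
    totals = {}
    for y, n, g in rows:
        count, exposure = totals.get(g, (0.0, 0.0))
        totals[g] = (count + y, exposure + n)
    return {g: count / exposure for g, (count, exposure) in totals.items()}


def gof(rows, n_terms=2):
    """Return (Pearson sum, deviance, df) with df = N - n_terms."""
    rates = group_rates(rows)
    pearson = 0.0
    deviance = 0.0
    for y, n, g in rows:
        mu = rates[g] * n  # expected count for this observation
        pearson += (y - mu) ** 2 / mu
        # y * log(y / mu) is taken to be 0 when y == 0
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        deviance += 2.0 * (log_term - (y - mu))
    return pearson, deviance, len(rows) - n_terms


rates = group_rates(rows)
irr = rates[1] / rates[0]
pearson, deviance, df = gof(rows)

print(f"IRR      = {irr:.6f}")
print(f"Pearson  = {pearson:.3f} on {df} df")
print(f"Deviance = {deviance:.3f} on {df} df")
```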

Now we will group the data by the unique covariate patterns of the model. In this case that simply amounts to grouping by XYZowned and summing counts (injuries) and exposure (n) within this grouping:

    . collapse (sum) injuries n, by(XYZowned)

    . list

         XYZowned   injuries        n
      1.        0         46    .7925
      2.        1         18    .2119

    . poisson injuries XYZowned, exposure(n) irr

    Iteration 0:  Log Likelihood = -5.2133484
    Iteration 1:  Log Likelihood = -5.2038269

    Poisson regression, normalized by n               Number of obs    =         2
    Goodness-of-fit chi2(0)    =     0.000            Model chi2(1)    =     1.768
    Prob > chi2                =         .            Prob > chi2      =    0.1836
    Log Likelihood             =    -5.204            Pseudo R2        =    0.1452

    ------------------------------------------------------------------------------
    injuries |        IRR   Std. Err.       z     P>|z|       [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
    XYZowned |   1.463466   .4068718      1.370   0.171       .8486574    2.523673
    ------------------------------------------------------------------------------

Note that the IRR and standard error are the same (up to rounding in the last digit), but the goodness-of-fit test is different. From the standpoint of the Poisson regression, the original and collapsed datasets are equivalent, but the original dataset carries more information about the Poisson-ness of the data, because you can examine the counts for small portions of exposure.
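The effect of grouping can be reproduced outside Stata. The Python sketch below sums injuries and exposure within each value of XYZowned, then evaluates the Pearson sum on both the original and the grouped data. As before, it assumes that the fitted rate in each group is total injuries over total exposure, which is valid only for a constant plus one binary covariate; the helper names `collapse_by_pattern` and `pearson_gof` are hypothetical. After grouping there is one cell per covariate pattern, every expected count equals its observed count, and both the statistic and the df fall to zero.

```python
from collections import defaultdict

# (injuries, n, XYZowned) for the nine airlines in the original listing.
rows = [
    (11, 0.0950, 1), (7, 0.1920, 0), (7, 0.0750, 0),
    (19, 0.2078, 0), (9, 0.1382, 0), (4, 0.0540, 1),
    (3, 0.1292, 0), (1, 0.0503, 0), (3, 0.0629, 1),
]


def collapse_by_pattern(rows):
    """Sum counts and exposure within each covariate pattern (XYZowned)."""
    sums = defaultdict(lambda: [0.0, 0.0])
    for y, n, g in rows:
        sums[g][0] += y
        sums[g][1] += n
    return [(y, n, g) for g, (y, n) in sorted(sums.items())]


def pearson_gof(rows, n_terms=2):
    """Pearson sum and df for the constant-plus-one-binary-covariate model."""
    # Assumption: fitted rate per group = total count / total exposure.
    rates = {g: y / n for y, n, g in collapse_by_pattern(rows)}
    stat = 0.0
    for y, n, g in rows:
        mu = rates[g] * n  # expected count for this cell
        stat += (y - mu) ** 2 / mu
    return stat, len(rows) - n_terms


collapsed = collapse_by_pattern(rows)
stat_raw, df_raw = pearson_gof(rows)
stat_grouped, df_grouped = pearson_gof(collapsed)

print("collapsed cells:", collapsed)
print(f"ungrouped: Pearson = {stat_raw:.3f}, df = {df_raw}")
print(f"grouped:   Pearson = {stat_grouped:.3f}, df = {df_grouped}")
```

The fitted rates are identical in both calls, which is why the IRR does not change; only the cells entering the sum differ.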

When the portions of exposure get too small, one gets the well-known problem of the expected counts for the Pearson chi-squared becoming small, so that the chi-squared approximation to the statistic's distribution becomes unreliable.

Perhaps Stata should automatically group by covariate pattern before computing
the Pearson's chi-squared, as **lfit** does after **logistic**. But in
some cases it is certainly legitimate NOT to group (this example is close to
being one of those cases; injuries are just a little too low for some
observations).

Note that Pearson’s chi-squared also has a problem when its df become
large. This happens for **poisson** when the number of observations
becomes large.

My personal rules of thumb:

- If the number of unique covariate patterns is not small (say greater than 20), then group on it for the gof test so that your dataset has only one observation per unique covariate pattern.
- Look at predicted (expected) counts. If there are any very small ones (< 2) or lots of small ones (< 5), view Pearson's chi-squared gof test with suspicion.
- If the df of the chi-squared is large (>50-100), take the result with a large grain of salt. (This is true for any chi-squared statistic.)
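The rules of thumb above can be sketched as a small checker in Python. This is only an illustration under stated assumptions: the function name `gof_warnings` and its parameters are hypothetical, "lots of small ones" is interpreted here as more than 20% of the cells (my own cutoff, not the author's), and the lower end of the 50 to 100 range is used as the df threshold. The expected counts in the usage lines are rounded values implied by the fitted rates in the nine-airline example; they were computed by hand for this sketch and do not appear in the Stata output.

```python
def gof_warnings(expected, df, n_obs=None, n_patterns=None,
                 very_small=2.0, small=5.0, max_small_share=0.20,
                 large_df=50, many_patterns=20):
    """Return a list of cautions based on the rules of thumb above.

    Assumptions: max_small_share stands in for "lots of small ones" and
    large_df for the low end of the 50-100 range; both are tunable.
    """
    warnings = []

    # Rule 1: with many covariate patterns, group to one row per pattern.
    if n_obs is not None and n_patterns is not None:
        if n_patterns > many_patterns and n_obs > n_patterns:
            warnings.append("group the data by covariate pattern first")

    # Rule 2: very small or many small expected counts.
    n_very_small = sum(1 for e in expected if e < very_small)
    n_small = sum(1 for e in expected if e < small)
    if n_very_small > 0:
        warnings.append(
            f"{n_very_small} expected count(s) below {very_small}")
    if n_small > max_small_share * len(expected):
        warnings.append(
            f"{n_small} of {len(expected)} expected counts below {small}")

    # Rule 3: large df.
    if df > large_df:
        warnings.append(f"df = {df} is large; treat the test with caution")

    return warnings


# Rounded expected counts for the nine ungrouped airline observations.
expected = [8.07, 11.14, 4.35, 12.06, 8.02, 4.59, 7.50, 2.92, 5.34]
airline_warnings = gof_warnings(expected, df=7, n_obs=9, n_patterns=2)
for message in airline_warnings:
    print(message)
```

For the airline data, none of the expected counts is below 2 but three of the nine are below 5, so only the second part of the small-counts rule is triggered under the 20% cutoff assumed here.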