# st: Proper standard error for a difference in rates for a Poisson distributed count?

From: David Harless <[email protected]>
To: [email protected]
Subject: st: Proper standard error for a difference in rates for a Poisson distributed count?
Date: Mon, 18 Oct 2004 11:55:20 -0400

Dear Statalisters:
I am writing to ask for advice on how to calculate a proper standard error (if one exists) for a mean *difference* in rates from a Poisson distributed count. Suggestions or citations would be greatly appreciated.

Here are the details:
I have counts of the number of failures by group over time. In each period there is also a score (from 3 to 5), which is claimed to be an (ordinal) rating of the safety of the group in that period. Example:

. list group period score expos failures in 17/23, sepby(group) noobs

  +----------------------------------------------+
  | group   period   score      expos   failures |
  |----------------------------------------------|
  |     7        1       4    10.0092         17 |
  |     7        2       5    15.3863         11 |
  |----------------------------------------------|
  |     8        1       5   143.2803         93 |
  |     8        2       4    32.9646         24 |
  |     8        3       3    31.2676         21 |
  |     8        4       4    89.2825         63 |
  |     8        5       3    26.6402          6 |
  +----------------------------------------------+

My main analysis is a Poisson regression with fixed effects for the groups (xtpoisson, fe) that would include dummy variables for the scores. But I would also like to provide a descriptive table that illustrates the change in failure rate as the score changes -- and have that table include a correct standard error.

My problem would be straightforward if I simply wanted to present mean failure rates by score:

. gen rate=failures/expos
. table score [aweight=expos] , c(m rate) format(%9.3f)

----------------------
    score | mean(rate)
----------+-----------
        3 |      0.964
        4 |      0.847
        5 |      0.875
----------------------

And the proper standard errors may be obtained using -ci- :

. sort score
. by score: ci failures, exposure(expos) poisson

-> score = 3
                                                          -- Poisson Exact --
    Variable |   Exposure        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
    failures |       4684    .9637415    .0143443         .93583    .9922742

And so on.
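
As a rough check on where this standard error comes from: for a Poisson count the variance equals the mean, so a rate estimated from D total failures over E total exposure has the standard error below. The symbols D, E, and \hat{\lambda} are notation introduced here for illustration, and the exposure is taken as the rounded value 4684 shown in the output.

```latex
\begin{align*}
\hat{\lambda} &= \frac{D}{E}, \qquad
\widehat{\mathrm{SE}}(\hat{\lambda}) = \frac{\sqrt{D}}{E} = \sqrt{\frac{\hat{\lambda}}{E}} \\
% score = 3: \hat{\lambda} = 0.9637415, E \approx 4684 (rounded in the display)
\sqrt{\frac{0.9637415}{4684}} &\approx 0.01434
\end{align*}
```

This agrees with the reported .0143443 up to rounding of the displayed exposure.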

But I want my table to show the means of the *differences* in failure rates (by group) for different scores. Along the lines of:

. tsset group period
panel variable: group, 1 to 153
time variable: period, 1 to 7

. gen d_rate=d.rate
(153 missing values generated)

. gen l_score=l.score
(153 missing values generated)

. gen sum_expos=expos+l.expos
(153 missing values generated)

. table l_score score [aweight=sum_expos], c(m d_rate ) f(%9.3f)
----------------------------------
l_score | 3 4 5
----------+-----------------------
3 | -0.029 -0.069 -0.074
4 | -0.026 -0.009 -0.038
5 | 0.135 -0.025
----------------------------------

I have already made an assumption by weighting each difference by the sum of the exposures over the two periods between which the difference in rates is calculated. Perhaps there is a better way to incorporate this information, but the mean of the differences in rates must be calculated in a way that reflects the different exposures.
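
The weighting just described can be written out explicitly. The notation below is introduced here and is not part of the original analysis: r_{it} is the rate failures/expos for group i in period t, E_{it} is the corresponding exposure, and the sums run over the group-period pairs that fall in one cell of the table. The last line is only a sketch under an added assumption, namely that the two period counts are independent Poisson draws with rates \lambda_{it}; it is a possible starting point, not an established answer to the question.

```latex
\begin{align*}
d_{it} &= r_{it} - r_{i,t-1}, \qquad w_{it} = E_{it} + E_{i,t-1} \\
\bar{d} &= \frac{\sum w_{it}\, d_{it}}{\sum w_{it}} \\
% Assumption (not from the post): Y_{it} ~ Poisson(\lambda_{it} E_{it}), independent across periods
\mathrm{Var}(d_{it}) &= \frac{\lambda_{it}}{E_{it}} + \frac{\lambda_{i,t-1}}{E_{i,t-1}}
\end{align*}
```

Consecutive differences within a group share a period (d_{it} and d_{i,t+1} both involve r_{it}), so the differences need not be independent of one another even under this assumption.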

So my question is: Can one calculate a proper standard error for a difference in rates for a Poisson distributed count, as in the above example?

Thanks,
Dave Harless