Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# st: From: Mikkel Høiberg <mikkelhoiberg@gmail.com>

- From: owner-statalist@hsphsun2.harvard.edu
- To: statalist@hsphsun2.harvard.edu
- Subject: st: From: Mikkel Høiberg
- Date: Fri, 26 Oct 2012 14:05:28 +0200

```
Dear Stata listers,

I am trying to compare incidence rates for hip fractures across
several different published studies.
The incidence rates are defined by 5-year age strata (e.g. 40-44,
45-49, etc. years of age; variable name Agegr) as well as gender.
The available values are the number of fractures (Fracture) and the
cohort population in each age group (Totalpopulation).
Thus I have three defining variables: gender, agegroup and
study-number (study).

The scientific question is whether the hip fracture incidences vary
significantly across these studies, even though they purport to be
samples from the same population.

I have tried the following Poisson regression in order to arrive at
incidence rate ratios.
As, in theory, no clustering should exist in these separate
incidences, I hope that this is not theoretically unsound.
I have used the following:

xi: poisson Fracture i.gender i.study i.Agegr, exposure(Totalpopulation) vce(robust) irr

Adding each of the three explanatory variables improves the
goodness of fit of the model (the chi2 value reported by "estat gof"
falls with each addition), and combining them gives an even better
fit. However, the chi-squared test is still significant, so the
model does not seem to fit very well.

As fracture rates seem to increase exponentially with increasing age,
I have tried transforming the age groups to a squared factor
(Agegr2 = Agegr*Agegr):

xi: poisson Fracture i.gender i.study i.Agegr2, exposure(Totalpopulation) vce(robust) irr

without any difference in the output.
I have also tried running the analysis with only two study populations
at a time, trying to optimize the fitted model; however, the chi-squared
test in estat gof is still significant.
I am aware that the three explanatory variables are a long way
from explaining all the variance, so maybe I am too optimistic about
how well the model can fit.

So far, I have not tried to illustrate the heterogeneity, e.g.
with a L'Abbé plot. I am unsure whether this would help me.

Should I be using a totally different way to try to test for
significant differences in incidence rates?
Or should I accept the significance of the chi-squared test? And if
so, what would be the best way of describing the statistical
output? By the IRR for each separate study?

Yours,

Mikkel Høiberg

```
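
A minimal sketch of how the question in the post (do the incidence rates differ across studies once gender and age group are accounted for?) could be put to Stata as a single joint test. It reuses the variable names from the post and assumes that gender, study, and Agegr are stored as non-negative integer codes and that Stata 11 or later is available, so that factor-variable notation can stand in for the xi: prefix. It illustrates the idea under those assumptions and is not meant as a final model.

```stata
* Sketch only. Variable names are taken from the post; the integer coding
* of gender, study, and Agegr is an assumption.
poisson Fracture i.gender i.study i.Agegr, exposure(Totalpopulation) vce(robust) irr

* Joint Wald test that all study coefficients are zero, i.e. that the
* incidence rate does not differ between studies given gender and age group
testparm i.study
```

A small p-value from testparm would suggest that at least one study's rate differs from the baseline study after adjusting for gender and age group; the IRRs reported on the study terms then describe each study relative to that baseline.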