

From: Joerg Luedicke <joerg.luedicke@gmail.com>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: stset for grouped data

Date: Thu, 14 Apr 2011 11:04:36 -0400

On Thu, Apr 14, 2011 at 6:39 AM, Dherani, Mukesh <M.K.Dherani@liverpool.ac.uk> wrote:

> Thanks. Yes it is aggregated data. What actually I want to do is to calculate the cumulative incidence rate by country.
> Assuming that there is no censoring, and on average each person in a particular age group has same age of onset, can't we calculate incidence rate? We have population by age group hence we may be able to calculate total person years of follow-up. I may be naive here, I thought if I expanded data based on
> expand cases
> I will get data for an individual and using above assumptions I may be able to calculate the incidence rate.

Maybe I am missing something here, but I would think that your cases have constant exposure ("no censoring", "same age of onset"), and that does not change if you merely inflate your data. You should be able to get the incidence rate from the data you have, for example by simply dividing "cases" by "population" within each age group. If you want to "test" something and need a model, you could run, for example, a Poisson regression with (logged) population as the offset. If we take your example data and run the Poisson model:

    . input region year agegp cases population

            region       year      agegp      cases  populat~n
      1. 1 1994 4 2 5000
      2. 1 1994 9 5 2548
      3. 1 1994 14 6 2547
      4. 1 1994 19 15 7521
      5. 1 1994 24 75 7896
      6. end

    . gen logpop=log( population)

    . list

         +-----------------------------------------------------+
         | region   year   agegp   cases   popula~n     logpop |
         |-----------------------------------------------------|
      1. |      1   1994       4       2       5000   8.517193 |
      2. |      1   1994       9       5       2548   7.843064 |
      3. |      1   1994      14       6       2547   7.842671 |
      4. |      1   1994      19      15       7521   8.925454 |
      5. |      1   1994      24      75       7896   8.974112 |
         +-----------------------------------------------------+

    . poisson cases i.agegp, offset( logpop) irr

    Iteration 0:   log likelihood =  -19.89126
    Iteration 1:   log likelihood = -10.835984
    Iteration 2:   log likelihood = -10.235966
    Iteration 3:   log likelihood = -10.233162
    Iteration 4:   log likelihood = -10.233161

    Poisson regression                                Number of obs   =          5
                                                      LR chi2(4)      =      84.25
                                                      Prob > chi2     =     0.0000
    Log likelihood = -10.233161                       Pseudo R2       =     0.8046

    ------------------------------------------------------------------------------
           cases |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           agegp |
              9  |   4.905808   4.104493     1.90   0.057     .9517967    25.28581
             14  |    5.88928   4.808577     2.17   0.030     1.188664    29.17866
             19  |   4.986039   3.753353     2.13   0.033     1.140235    21.80303
             24  |   23.74619    17.0135     4.42   0.000     5.830841    96.70675
          logpop |   (offset)
    ------------------------------------------------------------------------------

we see, for instance, that the incidence rate in the oldest age group is roughly 24 times as high as in the youngest age group (the reference category, agegp 4). However, there may still be better solutions to your problem.

J.
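As a quick cross-check of the reasoning above, the crude rates can be computed directly and compared with the incidence rate ratios from the model. This is only a minimal sketch: it assumes the five-row example dataset is still in memory, and the variable name `rate_per_1000` is made up for illustration. It also uses `exposure()`, which asks Stata to take the log of the population itself, in place of the hand-built `logpop` offset.

```stata
* Crude incidence per 1,000 population in each age group
* (rate_per_1000 is a hypothetical name chosen for this sketch)
gen rate_per_1000 = 1000 * cases / population
list agegp cases population rate_per_1000

* Same model as above, but with exposure(): Stata uses ln(population)
* as the offset, so logpop does not have to be created by hand
poisson cases i.agegp, exposure(population) irr
```

With these data the youngest group has 2/5000 = 0.4 cases per 1,000 and the oldest has 75/7896, about 9.5 per 1,000. Their ratio is about 23.7, which is the IRR reported for agegp 24.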

