
## Poisson regression

Stata’s poisson fits maximum-likelihood models of the number of occurrences (counts) of an event. In a Poisson regression model, the incidence rate for the jth observation is assumed to be given by

r_j = exp(b_0 + b_1*x_(1,j) + ... + b_k*x_(k,j))

If E_j is the exposure, the expected number of events C_j will be

C_j = E_j * r_j
= exp[ ln(E_j) + b_0 + b_1*x_(1,j) + ... + b_k*x_(k,j) ]

This is the model fitted by poisson. E_j may be specified or, if not specified, is assumed to be 1.

. poisson deaths smokes i.agecat, exposure(pyears) irr

Iteration 0:   log likelihood = -33.823284
Iteration 1:   log likelihood = -33.600471
Iteration 2:   log likelihood = -33.600153
Iteration 3:   log likelihood = -33.600153

Poisson regression                                Number of obs   =         10
                                                  LR chi2(5)      =     922.93
                                                  Prob > chi2     =     0.0000
Log likelihood = -33.600153                       Pseudo R2       =     0.9321

| deaths | IRR | Std. Err. | z | P>\|z\| | [95% Conf. | Interval] |
|---|---|---|---|---|---|---|
| smokes | 1.425519 | .1530638 | 3.30 | 0.001 | 1.154984 | 1.759421 |
| agecat | | | | | | |
| 45-54 | 4.410584 | .8605197 | 7.61 | 0.000 | 3.009011 | 6.464997 |
| 55-64 | 13.8392 | 2.542638 | 14.30 | 0.000 | 9.654328 | 19.83809 |
| 65-74 | 28.51678 | 5.269878 | 18.13 | 0.000 | 19.85177 | 40.96395 |
| 75-84 | 40.45121 | 7.775511 | 19.25 | 0.000 | 27.75326 | 58.95885 |
| _cons | .0003636 | .0000697 | -41.30 | 0.000 | .0002497 | .0005296 |
| ln(pyears) | 1 | (exposure) | | | | |
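To see how the reported ratios combine, the expected count C_j = E_j * r_j can be evaluated by hand from the output above. With the irr option, _cons is shown on the exponentiated scale, so it acts as the baseline incidence rate per person-year (nonsmokers in the omitted age category). The sketch below assumes a hypothetical group of smokers aged 55-64 observed for 1,000 person-years; the 1,000 is an illustrative value, not a figure from the data.

```stata
* Expected deaths = exposure * baseline rate * IRR(smokes) * IRR(agecat 55-64)
* 1000 is a hypothetical exposure; the other numbers are copied from the output above
display %6.2f 1000 * .0003636 * 1.425519 * 13.8392
```

This works out to roughly 7.2 expected deaths per 1,000 person-years for that group.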

The syntax of all estimation commands is the same: the name of the dependent variable is followed by the names of the independent variables, which are followed by a comma and any options. In this case, we controlled for the exposure (person-years recorded in the variable pyears) and asked that results be displayed as incidence-rate ratios rather than as coefficients.
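The two options can also be unpacked step by step. An incidence-rate ratio is the exponentiated coefficient, exp(b_i), and exposure(pyears) is equivalent to including ln(pyears) as an offset whose coefficient is fixed at 1. The following is a minimal sketch, assuming the example is based on the dollhill3 dataset that ships with Stata's documentation (the page itself does not name the dataset); lnpyears is a hypothetical variable name introduced only for illustration.

```stata
* Assumption: the data are Stata's dollhill3 example dataset
webuse dollhill3, clear

* Fit on the coefficient scale (no irr option)
poisson deaths smokes i.agecat, exposure(pyears)

* Exponentiating a coefficient gives the corresponding incidence-rate ratio
display exp(_b[smokes])

* Replay the same fit as incidence-rate ratios without refitting
poisson, irr

* exposure(pyears) is the same as an offset of ln(pyears)
* lnpyears is a hypothetical name, not a variable in the dataset
generate double lnpyears = ln(pyears)
poisson deaths smokes i.agecat, offset(lnpyears) irr
```

If the dataset assumption holds, both fits should report the same ratios; exposure() simply saves the step of creating the logged variable.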

svy: poisson can be used to analyze complex survey data, and the mi estimate: poisson command performs estimation using multiple imputations. Also, Stata provides Cox regression, exponential, Weibull, and other parametric survival models, as well as logistic regression, and all can be used to analyze complex survey data or to perform estimation using multiple imputations.