Title
[R] expoisson -- Exact Poisson regression
Syntax
expoisson depvar indepvars [if] [in] [weight] [, options]
options                 description
-------------------------------------------------------------------------
Model
  condvars(varlist)     condition on variables in varlist
  group(varname)        groups/strata are stratified by unique values of
                          varname
  exposure(varname_e)   include ln(varname_e) in model with coefficient
                          constrained to 1
  offset(varname_o)     include varname_o in model with coefficient
                          constrained to 1
Options
  memory(#[b|k|m|g])    set limit on memory usage; default is memory(25m)
  saving(filename)      save the joint conditional distribution to filename
Reporting
  level(#)              set confidence level; default is level(95)
  irr                   report incidence-rate ratios
  test(testopt)         report significance of observed sufficient
                          statistic, conditional scores test, or
                          conditional probabilities test
  mue(varlist)          compute the median unbiased estimates for varlist
  midp                  use the mid-p-value rule
  nolog                 do not display the enumeration log
-------------------------------------------------------------------------
by, statsby, and xi are allowed; see prefix.
fweights are allowed; see weight.
See [R] expoisson postestimation for features available after estimation.
Menu
Statistics > Exact statistics > Exact Poisson regression
Description
expoisson fits an exact Poisson regression model of depvar on indepvars.
Exact Poisson regression is an alternative to standard
maximum-likelihood-based Poisson regression (see [R] poisson) that offers
more accurate inference in small samples because it does not depend on
asymptotic results. For stratified data, expoisson is an alternative to
fixed-effects Poisson regression (see xtpoisson, fe in [XT] xtpoisson);
like fixed-effects Poisson regression, exact Poisson regression
conditions on the number of events in each stratum.
Exact Poisson regression is computationally intensive, so if you have
regressors whose parameter estimates are not of interest (i.e., nuisance
parameters), you should specify those variables in the condvars() option
instead of in indepvars.
Options
+-------+
----+ Model +------------------------------------------------------------
condvars(varlist) specifies variables whose parameter estimates are not
of interest to you. You can save substantial computer time and
memory by moving such variables from indepvars to condvars().
You will get the same results for x1 and x3 whether
you type
. expoisson y x1 x2 x3 x4
or
. expoisson y x1 x3, condvars(x2 x4)
group(varname) specifies the variable defining the strata, if any. A
constant term is assumed for each stratum identified in varname, and
the sufficient statistics for indepvars are conditioned on the
observed number of events within each group (as well as other
variables in the model). The group variable must be integer valued.
exposure(varname_e), offset(varname_o); see [R] estimation options.
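As a rough sketch of how the Model options combine, the following assumes a
hypothetical dataset with a count outcome y, a regressor of interest x1, a
nuisance regressor x2, an integer-valued stratum identifier id, and a
person-time variable pt. None of these names come from a shipped dataset.

```stata
* Hypothetical variables: y, x1, x2, id, pt (illustrative names only)
* x1 is of interest; x2 is a nuisance regressor, so it goes in condvars()
* id must be integer valued; ln(pt) enters with coefficient constrained to 1
expoisson y x1, condvars(x2) group(id) exposure(pt)
```

Only x1 is reported here; x2 is conditioned out rather than estimated.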
+---------+
----+ Options +----------------------------------------------------------
memory(#[b|k|m|g]) sets a limit on the amount of memory expoisson can use
when computing the conditional distribution of the parameter
sufficient statistics. The default is memory(25m), where m stands
for megabyte, or 1,048,576 bytes. The following are also available:
b stands for byte; k stands for kilobyte, which is equal to 1,024
bytes; and g stands for gigabyte, which is equal to 1,024 megabytes.
The minimum setting allowed is 1m and the maximum is 2048m or 2g, but
do not attempt to use more memory than is available on your computer.
saving(filename[, replace]) saves the joint conditional distribution for
each independent variable specified in indepvars. There is one file
for each variable and it is named using the prefix filename with the
variable name appended. For example, saving(mydata) with an
independent variable named X would generate a data file named
mydata_X.dta. Use replace to replace an existing file. Each file
contains the conditional distribution for one of the independent
variables specified in indepvars conditioned on all other indepvars
and those variables specified in condvars(). There are two variables
in each data file: the feasible sufficient statistics for the
variable's parameter and their associated weights. The weights
variable is named _w_.
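A minimal sketch of memory() and saving() used together, again assuming
hypothetical variables y, x1, and x2 and a hypothetical file prefix mydist.
The file name used below follows the naming rule described above and has not
been checked against an actual run.

```stata
* Hypothetical variables y, x1, x2; raise the memory limit to 100 megabytes
* and save one conditional-distribution file per independent variable
expoisson y x1 x2, memory(100m) saving(mydist, replace)

* By the naming rule above, the file for x1 should be mydist_x1.dta.
* clear discards the data in memory, so save your data first if needed.
use mydist_x1, clear
describe
```

describe should list two variables: the feasible sufficient statistics and
the weights variable _w_.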
+-----------+
----+ Reporting +--------------------------------------------------------
level(#); see [R] estimation options. The level(#) option will not work
on replay because confidence intervals are based on
estimator-specific enumerations. To change the confidence level, you
must refit the model.
irr reports estimated coefficients transformed to incidence-rate ratios,
that is, exp(b) rather than b. Standard errors and confidence
intervals are similarly transformed. This option affects how results
are displayed, not how they are estimated or stored. irr may be
specified at estimation or when replaying previously estimated
results.
test(sufficient|score|probability) reports the significance level of the
observed sufficient statistic, the conditional scores test, or the
conditional probabilities test. The default is test(sufficient).
All the statistics are computed at estimation time, and each
statistic may be displayed postestimation; see [R] expoisson
postestimation.
mue(varlist) specifies that median unbiased estimates (MUEs) be reported
for the variables in varlist. By default, the conditional maximum
likelihood estimates (CMLEs) are reported, except for those
parameters for which the CMLEs are infinite. Specify mue(_all) if
you want MUEs for all the indepvars.
midp instructs expoisson to use the mid-p-value rule when computing the
MUEs, significance levels, and confidence intervals. This rule adjusts
for the discreteness of the distribution by halving the discrete
probability of the observed statistic before adding it to the p-value.
The mid-p-value rule cannot be applied to MUEs whose corresponding
parameter CMLE is infinite.
nolog prevents the display of the enumeration log. By default, the
enumeration log is displayed, showing the progress of computing the
conditional distribution of the sufficient statistics.
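The Reporting options can be combined in one call. The sketch below uses the
smokes dataset and the variables cases, smokes, and peryrs from the Examples
section; the particular combination of options is illustrative and assumes
the CMLE for smokes is finite, so that midp can apply to its MUE.

```stata
webuse smokes, clear

* 90% confidence intervals, mid-p-value rule, MUE for smokes, no enumeration log
expoisson cases smokes, exposure(peryrs) irr level(90) mue(smokes) midp nolog

* Replay with the conditional probabilities test; level() cannot be changed
* on replay, so the intervals stay at the 90% level chosen above
expoisson, test(probability) irr
```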
Technical note
The option memory(#) limits the amount of memory that expoisson will
consume when computing the conditional distribution of the parameter
sufficient statistics. memory() is independent of the system setting
c(memory) (see set memory in [D] memory), and expoisson can exceed the
memory limit specified in c(memory) without terminating. By default, a
log is provided that displays the number of enumerations (the size of the
conditional distribution) after processing each observation. Typically,
the number of enumerations increases at first and then decreases once the
multivariate shift algorithm determines that some of the enumerations
cannot achieve the observed sufficient statistics of the conditioning
variables. When the algorithm is complete, however, the conditional
distribution of the parameter sufficient statistics must be stored as a
dataset. It is therefore possible to get a memory error after the
algorithm has completed if c(memory) is not large enough to store the
conditional distribution.
Examples
Setup
. webuse smokes
Perform exact Poisson regression of cases on smokes using exposure peryrs
. expoisson cases smokes, exposure(peryrs) irr
Replay results and report conditional scores test
. expoisson, test(score) irr
Saved results
expoisson saves the following in e():
Scalars
  e(N)                    number of observations
  e(k_groups)             number of groups
  e(relative_weight)      relative weight for the observed e(sufficient)
                            and e(condvars)
  e(sum_y)                sum of depvar
  e(k_indvars)            number of independent variables
  e(k_condvars)           number of conditioning variables
  e(midp)                 mid-p-value rule indicator
  e(eps)                  relative difference tolerance
Macros
  e(cmd)                  expoisson
  e(cmdline)              command as typed
  e(title)                Exact Poisson regression
  e(depvar)               dependent variable
  e(indvars)              independent variables
  e(condvars)             conditional variables
  e(groupvar)             group variable
  e(exposure)             exposure variable
  e(level)                confidence level
  e(wtype)                weight type
  e(wexp)                 weight expression
  e(datasignature)        the checksum
  e(datasignaturevars)    variables used in calculation of checksum
  e(properties)           b V
  e(estat_cmd)            program used to implement estat
  e(predict)              program used to implement predict
  e(marginsnotok)         predictions disallowed by margins
Matrices
  e(b)                    coefficient vector
  e(mue_indicators)       indicator for elements of e(b) estimated using
                            MUE instead of CMLE
  e(se)                   e(b) standard errors (CMLEs only)
  e(ci)                   matrix of e(level) confidence intervals for e(b)
  e(sum_y_groups)         sum of e(depvar) for each group
  e(N_g)                  number of observations in each group
  e(sufficient)           sufficient statistics for e(b)
  e(p_sufficient)         p-value for e(sufficient)
  e(scoretest)            conditional scores tests for indepvars
  e(p_scoretest)          p-value for e(scoretest)
  e(probtest)             conditional probability tests for indepvars
  e(p_probtest)           p-value for e(probtest)
Function
  e(sample)               marks estimation sample
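The saved results can be inspected directly after estimation. This short
sketch refits the model from the Examples section and displays a few of the
e() entries listed above; which entries to show is an arbitrary choice made
for illustration.

```stata
webuse smokes, clear
expoisson cases smokes, exposure(peryrs) nolog

* Scalars
display "observations:  " e(N)
display "sum of depvar: " e(sum_y)

* Matrices: coefficients, confidence intervals, and sufficient-statistic p-values
matrix list e(b)
matrix list e(ci)
matrix list e(p_sufficient)
```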
Also see
Manual: [R] expoisson
Help: [R] expoisson postestimation;
[R] poisson, [XT] xtpoisson