**[R] nbreg** -- Negative binomial regression

__Syntax__

Negative binomial regression model

**nbreg** *depvar* [*indepvars*] [*if*] [*in*] [*weight*] [**,** *nbreg_options*]

Generalized negative binomial model

**gnbreg** *depvar* [*indepvars*] [*if*] [*in*] [*weight*] [**,** *gnbreg_options*]

*nbreg_options* Description
-------------------------------------------------------------------------
Model
__nocons__**tant** suppress constant term
__d__**ispersion(**__m__**ean)** parameterization of dispersion; the default
__d__**ispersion(**__c__**onstant)** constant dispersion for all observations
__exp__**osure(***varname_e***)** include ln(*varname_e*) in model with
coefficient constrained to 1
__off__**set(***varname_o***)** include *varname_o* in model with coefficient
constrained to 1
__const__**raints(***constraints***)** apply specified linear constraints
__col__**linear** keep collinear variables

SE/Robust
**vce(***vcetype***)** *vcetype* may be **oim**, __r__**obust**, __cl__**uster**
*clustvar*, **opg**, __boot__**strap**, or __jack__**knife**

Reporting
__l__**evel(***#***)** set confidence level; default is **level(95)**
__nolr__**test** suppress likelihood-ratio test
__ir__**r** report incidence-rate ratios
__nocnsr__**eport** do not display constraints
*display_options* control columns and column formats, row
spacing, line width, display of omitted
variables and base and empty cells, and
factor-variable labeling

Maximization
*maximize_options* control the maximization process; seldom
used

__coefl__**egend** display legend instead of statistics
-------------------------------------------------------------------------

*gnbreg_options* Description
-------------------------------------------------------------------------
Model
__nocons__**tant** suppress constant term
__lna__**lpha(***varlist***)** dispersion model variables
__exp__**osure(***varname_e***)** include ln(*varname_e*) in model with
coefficient constrained to 1
__off__**set(***varname_o***)** include *varname_o* in model with coefficient
constrained to 1
__const__**raints(***constraints***)** apply specified linear constraints
__col__**linear** keep collinear variables

SE/Robust
**vce(***vcetype***)** *vcetype* may be **oim**, __r__**obust**, __cl__**uster**
*clustvar*, **opg**, __boot__**strap**, or __jack__**knife**

Reporting
__l__**evel(***#***)** set confidence level; default is **level(95)**
__ir__**r** report incidence-rate ratios
__nocnsr__**eport** do not display constraints
*display_options* control columns and column formats, row
spacing, line width, display of omitted
variables and base and empty cells, and
factor-variable labeling

Maximization
*maximize_options* control the maximization process; seldom
used

__coefl__**egend** display legend instead of statistics
-------------------------------------------------------------------------

*indepvars* and *varlist* may contain factor variables; see fvvarlist.
*depvar*, *indepvars*, *varname_e*, and *varname_o* may contain time-series
operators (**nbreg** only); see tsvarlist.
**bayes**, **bootstrap**, **by** (**nbreg** only), **fmm** (**nbreg** only), **fp** (**nbreg** only),
**jackknife**, **mfp** (**nbreg** only), **mi estimate**, **nestreg** (**nbreg** only),
**rolling**, **statsby**, **stepwise**, and **svy** are allowed; see prefix. For more
details, see **[BAYES] bayes: gnbreg**, **[BAYES] bayes: nbreg**, and **[FMM]**
**fmm: nbreg**.
**vce(bootstrap)** and **vce(jackknife)** are not allowed with the **mi estimate**
prefix.
Weights are not allowed with the **bootstrap** prefix.
**vce()** and weights are not allowed with the **svy** prefix.
**fweight**s, **iweight**s, and **pweight**s are allowed; see weight.
**coeflegend** does not appear in the dialog box.
See **[R] nbreg postestimation** for features available after estimation.

__Menu__

__nbreg__

**Statistics > Count outcomes > Negative binomial regression**

__gnbreg__

**Statistics > Count outcomes > Generalized negative binomial**
**regression**

__Description__

**nbreg** fits a negative binomial regression model for a nonnegative count
dependent variable. In this model, the count variable is believed to be
generated by a Poisson-like process, except that the variation is allowed
to be greater than that of a true Poisson. This extra variation is
referred to as overdispersion.
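
One informal way to look for overdispersion before turning to **nbreg** is to fit the corresponding Poisson model and inspect its goodness-of-fit statistics. The sketch below assumes the rod93 dataset used in the Examples section of this entry; it is an illustration, not a required step.

```stata
* Assumes the rod93 example dataset can be loaded with webuse
webuse rod93, clear

* Poisson model with the same mean structure as the nbreg example
poisson deaths i.cohort, exposure(exposure)

* Deviance and Pearson goodness-of-fit tests; a poor Poisson fit is
* one informal sign that the data may be overdispersed
estat gof
```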

**gnbreg** fits a generalization of the negative binomial mean-dispersion
model; the shape parameter alpha may also be parameterized.

__Options for nbreg__

+-------+
----+ Model +------------------------------------------------------------

**noconstant**; see **[R] estimation options**.

**dispersion(mean**|**constant)** specifies the parameterization of the model.
**dispersion(mean)**, the default, yields a model with dispersion equal
to 1 + alpha*exp(xb + offset); that is, the dispersion is a function
of the expected mean: exp(xb + offset). **dispersion(constant)** has
dispersion equal to 1 + delta; that is, it is a constant for all
observations.

**exposure(***varname_e***)**, **offset(***varname_o***)**, **constraints(***constraints***)**, and
**collinear**; see **[R] estimation options**.

+-----------+
----+ SE/Robust +--------------------------------------------------------

**vce(***vcetype***)** specifies the type of standard error reported, which
includes types that are derived from asymptotic theory (**oim**, **opg**),
that are robust to some kinds of misspecification (**robust**), that
allow for intragroup correlation (**cluster** *clustvar*), and that use
bootstrap or jackknife methods (**bootstrap**, **jackknife**); see **[R]**
*vce_option*.
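
As a brief illustration of requesting a non-default variance estimator, the following sketch refits the example model with **vce(robust)** and then displays the macros that record the choice. It assumes the rod93 dataset from the Examples section.

```stata
* Assumes the rod93 example dataset can be loaded with webuse
webuse rod93, clear

* Request robust (sandwich) standard errors
nbreg deaths i.cohort, exposure(exposure) vce(robust)

* The vcetype specified and the label used for the Std. Err. column
display "`e(vce)'"
display "`e(vcetype)'"
```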

+-----------+
----+ Reporting +--------------------------------------------------------

**level(***#***)**; see **[R] estimation options**.

**nolrtest** suppresses fitting the Poisson model. Without this option, a
comparison Poisson model is fit, and the likelihood is used in a
likelihood-ratio test of the null hypothesis that the dispersion
parameter is zero.
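
The comparison test can be reproduced from the stored results. The sketch below assumes the rod93 dataset from the Examples section and the default **vce(oim)**. The halved p-value reflects the usual boundary adjustment for testing a parameter at the edge of its range, and is meaningful only when the statistic is positive.

```stata
* Assumes the rod93 example dataset can be loaded with webuse
webuse rod93, clear
nbreg deaths i.cohort, exposure(exposure)

* LR statistic rebuilt from the negative binomial and Poisson
* log likelihoods, next to the stored value
display "recomputed LR statistic = " 2*(e(ll) - e(ll_c))
display "stored e(chi2_c)        = " e(chi2_c)

* Boundary-adjusted p-value: half the chi-squared(1) tail area
display "p-value                 = " 0.5*chi2tail(1, e(chi2_c))
```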

**irr** reports estimated coefficients transformed to incidence-rate ratios,
that is, exp(b) rather than b. Standard errors and confidence
intervals are similarly transformed. This option affects how results
are displayed, not how they are estimated or stored. **irr** may be
specified at estimation or when replaying previously estimated
results.
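
For instance, a model can be fit on the coefficient scale and then replayed as incidence-rate ratios without refitting. The sketch assumes the rod93 dataset from the Examples section and that **cohort** has a level coded 2.

```stata
* Assumes the rod93 example dataset can be loaded with webuse
webuse rod93, clear

* Fit and display coefficients b
nbreg deaths i.cohort, exposure(exposure)

* Replay the same results as incidence-rate ratios, exp(b)
nbreg, irr

* The IRR for one coefficient computed by hand
* (assumes cohort has a level coded 2)
display exp(_b[2.cohort])
```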

**nocnsreport**; see **[R] estimation options**.

*display_options*: **noci**, __nopv__**alues**, __noomit__**ted**, **vsquish**, __noempty__**cells**,
__base__**levels**, __allbase__**levels**, __nofvlab__**el**, **fvwrap(***#***)**, **fvwrapon(***style***)**,
**cformat(%***fmt***)**, **pformat(%***fmt***)**, **sformat(%***fmt***)**, and **nolstretch**; see **[R]**
**estimation options**.

+--------------+
----+ Maximization +-----------------------------------------------------

*maximize_options*: __dif__**ficult**, __tech__**nique(***algorithm_spec***)**, __iter__**ate(***#***)**,
[__no__]__lo__**g**, __tr__**ace**, __grad__**ient**, **showstep**, __hess__**ian**, __showtol__**erance**,
__tol__**erance(***#***)**, __ltol__**erance(***#***)**, __nrtol__**erance(***#***)**, __nonrtol__**erance**, and
**from(***init_specs***)**; see **[R] maximize**. These options are seldom used.

Setting the optimization type to **technique(bhhh)** resets the default
*vcetype* to **vce(opg)**.

The following option is available with **nbreg** but is not shown in the
dialog box:

**coeflegend**; see **[R] estimation options**.

__Options for gnbreg__

        +-------+
    ----+ Model +------------------------------------------------------------

**noconstant**; see **[R] estimation options**.

**lnalpha(***varlist***)** allows you to specify a linear equation for ln(alpha).
Specifying **lnalpha(male old)** means that ln(alpha)=a_0 + a_1**male** +
a_2**old**, where a_0, a_1, and a_2 are parameters to be estimated along
with the other model coefficients. If this option is not specified,
**gnbreg** and **nbreg** will produce the same results because the shape
parameter will be parameterized as a constant.
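
The equivalence described above can be checked informally by fitting both commands to the same specification and comparing log likelihoods. This sketch assumes the rod93 dataset from the Examples section; the scalar names are illustrative only.

```stata
* Assumes the rod93 example dataset can be loaded with webuse
webuse rod93, clear
generate logexp = ln(exposure)

* nbreg with the default mean-dispersion parameterization
quietly nbreg deaths i.cohort, offset(logexp)
scalar ll_nbreg = e(ll)

* gnbreg without lnalpha(): ln(alpha) is modeled as a constant only
quietly gnbreg deaths i.cohort, offset(logexp)
scalar ll_gnbreg = e(ll)

* The two log likelihoods should agree up to convergence tolerance
display "nbreg  ll = " ll_nbreg
display "gnbreg ll = " ll_gnbreg
display "relative difference = " reldif(ll_nbreg, ll_gnbreg)
```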

**exposure(***varname_e***)**, **offset(***varname_o***)**, **constraints(***constraints***)**, and
**collinear**; see **[R] estimation options**.

+-----------+
----+ SE/Robust +--------------------------------------------------------

**vce(***vcetype***)** specifies the type of standard error reported, which
includes types that are derived from asymptotic theory (**oim**, **opg**),
that are robust to some kinds of misspecification (**robust**), that
allow for intragroup correlation (**cluster** *clustvar*), and that use
bootstrap or jackknife methods (**bootstrap**, **jackknife**); see **[R]**
*vce_option*.

+-----------+
----+ Reporting +--------------------------------------------------------

**level(***#***)**; see **[R] estimation options**.

**irr** reports estimated coefficients transformed to incidence-rate ratios.
Standard errors and confidence intervals are similarly transformed.
This option affects how results are displayed, not how they are
estimated or stored. **irr** may be specified at estimation or when
replaying previously estimated results.

**nocnsreport**; see **[R] estimation options**.

*display_options*: **noci**, __nopv__**alues**, __noomit__**ted**, **vsquish**, __noempty__**cells**,
__base__**levels**, __allbase__**levels**, __nofvlab__**el**, **fvwrap(***#***)**, **fvwrapon(***style***)**,
**cformat(%***fmt***)**, **pformat(%***fmt***)**, **sformat(%***fmt***)**, and **nolstretch**; see **[R]**
**estimation options**.

+--------------+
----+ Maximization +-----------------------------------------------------

*maximize_options*: __dif__**ficult**, __tech__**nique(***algorithm_spec***)**, __iter__**ate(***#***)**,
[__no__]__lo__**g**, __tr__**ace**, __grad__**ient**, **showstep**, __hess__**ian**, __showtol__**erance**,
__tol__**erance(***#***)**, __ltol__**erance(***#***)**, __nrtol__**erance(***#***)**, __nonrtol__**erance**, and
**from(***init_specs***)**; see **[R] maximize**. These options are seldom used.

Setting the optimization type to **technique(bhhh)** resets the default
*vcetype* to **vce(opg)**.

The following option is available with **gnbreg** but is not shown in the
dialog box:

**coeflegend**; see **[R] estimation options**.

__Remarks__

**nbreg** will fit two different parameterizations of the negative binomial
model. The default, given by the **dispersion(mean)** option, has dispersion
for the jth observation equal to 1 + alpha*exp(x_j*b + offset_j); that is,
the dispersion is a function of the expected mean of the counts for the
jth observation. The alternative parameterization, given by the
**dispersion(constant)** option, has dispersion equal to 1 + delta; that is,
it is a constant for all observations.
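
Reading "dispersion" as the ratio of the conditional variance to the conditional mean, the two parameterizations can be written as follows. The symbol $\mu_j$ is notation introduced here for the expected count; it is not used elsewhere in this entry.

```latex
\begin{align*}
\mu_j &= \exp(x_j b + \mathrm{offset}_j) \\
\text{dispersion(mean):}\quad
  \operatorname{Var}(y_j) &= \mu_j \,(1 + \alpha\,\mu_j) \\
\text{dispersion(constant):}\quad
  \operatorname{Var}(y_j) &= \mu_j \,(1 + \delta)
\end{align*}
```

In both cases, setting the dispersion parameter to zero gives a variance equal to the mean, as in a Poisson model.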

For the default model, alpha = 0 (or ln(alpha) = -infinity) corresponds
to dispersion = 1, and, thus, it is simply a Poisson model. Likewise,
for the alternative parameterization, delta = 0 (or ln(delta) =
-infinity) corresponds to dispersion = 1, and it is simply a Poisson
model.

Users may want to fit both parameterizations and choose the one with the
larger (least negative) log likelihood. The two parameterizations usually
yield similar results, so the choice between them is usually not
important.
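
One way to make that comparison is to store each fit and tabulate the log likelihoods side by side. The sketch assumes the rod93 dataset from the Examples section; the names nb_mean and nb_const are arbitrary labels chosen for illustration.

```stata
* Assumes the rod93 example dataset can be loaded with webuse
webuse rod93, clear

* Default parameterization: dispersion = 1 + alpha*exp(xb + offset)
quietly nbreg deaths i.cohort, exposure(exposure) dispersion(mean)
estimates store nb_mean

* Alternative parameterization: dispersion = 1 + delta
quietly nbreg deaths i.cohort, exposure(exposure) dispersion(constant)
estimates store nb_const

* Log likelihoods, degrees of freedom, AIC, and BIC for both fits
estimates stats nb_mean nb_const
```

Because both models have the same number of parameters, ranking them by AIC or BIC is the same as ranking them by log likelihood.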

See **[XT] xtpoisson** and **[XT] xtnbreg** for closely related panel estimators.

__Examples__

Setup
**. webuse rod93**
**. generate logexp=ln(exposure)**

Fit a negative binomial regression model
        **. nbreg deaths i.cohort, exposure(exposure)**

Same as above command
**. nbreg deaths i.cohort, offset(logexp)**

Same as above command, but change dispersion from **mean** to **constant**
**. nbreg deaths i.cohort, offset(logexp) dispersion(constant)**

Fit a generalized negative binomial model
**. gnbreg deaths age_mos, lnalpha(i.cohort) offset(logexp)**
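
To confirm that the **exposure()** and **offset()** forms above describe the same model, the coefficient vectors can be compared directly. This is a minimal sketch using the same dataset; the matrix names b_exp and b_off are illustrative.

```stata
* Assumes the rod93 example dataset can be loaded with webuse
webuse rod93, clear
generate logexp = ln(exposure)

* Exposure form: ln(exposure) enters with coefficient 1
quietly nbreg deaths i.cohort, exposure(exposure)
matrix b_exp = e(b)

* Offset form: the logged variable is supplied directly
quietly nbreg deaths i.cohort, offset(logexp)
matrix b_off = e(b)

* Largest relative difference between the two coefficient vectors;
* it should be near zero
display mreldif(b_exp, b_off)
```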

__Stored results__

**nbreg** stores the following in **e()**:

Scalars
**e(N)** number of observations
**e(k)** number of parameters
**e(k_aux)** number of auxiliary parameters
**e(k_eq)** number of equations in **e(b)**
**e(k_eq_model)** number of equations in overall model test
**e(k_dv)** number of dependent variables
**e(df_m)** model degrees of freedom
**e(r2_p)** pseudo-R-squared
**e(ll)** log likelihood
**e(ll_0)** log likelihood, constant-only model
**e(ll_c)** log likelihood, comparison model
**e(alpha)** value of alpha
**e(delta)** value of delta
**e(N_clust)** number of clusters
**e(chi2)** chi-squared
**e(chi2_c)** chi-squared for comparison test
**e(p)** p-value for model test
**e(rank)** rank of **e(V)**
**e(rank0)** rank of **e(V)** for constant-only model
**e(ic)** number of iterations
**e(rc)** return code
**e(converged)** **1** if converged, **0** otherwise

Macros
**e(cmd)** **nbreg**
**e(cmdline)** command as typed
**e(depvar)** name of dependent variable
**e(wtype)** weight type
**e(wexp)** weight expression
**e(title)** title in estimation output
**e(clustvar)** name of cluster variable
**e(offset)** linear offset variable
**e(chi2type)** **Wald** or **LR**; type of model chi-squared test
**e(chi2_ct)** **Wald** or **LR**; type of model chi-squared test
corresponding to **e(chi2_c)**
**e(dispers)** **mean** or **constant**
**e(vce)** *vcetype* specified in **vce()**
**e(vcetype)** title used to label Std. Err.
**e(opt)** type of optimization
**e(which)** **max** or **min**; whether optimizer is to perform
maximization or minimization
**e(ml_method)** type of **ml** method
**e(user)** name of likelihood-evaluator program
**e(technique)** maximization technique
**e(properties)** **b V**
**e(predict)** program used to implement **predict**
**e(asbalanced)** factor variables **fvset** as **asbalanced**
**e(asobserved)** factor variables **fvset** as **asobserved**

Matrices
**e(b)** coefficient vector
**e(Cns)** constraints matrix
**e(ilog)** iteration log (up to 20 iterations)
**e(gradient)** gradient vector
**e(V)** variance-covariance matrix of the estimators
**e(V_modelbased)** model-based variance

Functions
**e(sample)** marks estimation sample
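
The stored results listed above can be inspected after estimation. A short sketch, again assuming the rod93 dataset from the Examples section:

```stata
* Assumes the rod93 example dataset can be loaded with webuse
webuse rod93, clear
quietly nbreg deaths i.cohort, exposure(exposure)

* Everything nbreg left behind in e()
ereturn list

* Individual scalars and macros
display "log likelihood   = " e(ll)
display "alpha            = " e(alpha)
display "parameterization = " "`e(dispers)'"

* Coefficient vector and the size of the estimation sample
matrix list e(b)
count if e(sample)
```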

**gnbreg** stores the following in **e()**:

Scalars
**e(N)** number of observations
**e(k)** number of parameters
**e(k_eq)** number of equations in **e(b)**
**e(k_eq_model)** number of equations in overall model test
**e(k_dv)** number of dependent variables
**e(df_m)** model degrees of freedom
**e(r2_p)** pseudo-R-squared
**e(ll)** log likelihood
**e(ll_0)** log likelihood, constant-only model
**e(N_clust)** number of clusters
**e(chi2)** chi-squared
**e(p)** p-value for model test
**e(rank)** rank of **e(V)**
**e(rank0)** rank of **e(V)** for constant-only model
**e(ic)** number of iterations
**e(rc)** return code
**e(converged)** **1** if converged, **0** otherwise

Macros
**e(cmd)** **gnbreg**
**e(cmdline)** command as typed
**e(depvar)** name of dependent variable
**e(wtype)** weight type
**e(wexp)** weight expression
**e(title)** title in estimation output
**e(clustvar)** name of cluster variable
**e(offset1)** linear offset variable
**e(chi2type)** **Wald** or **LR**; type of model chi-squared test
**e(vce)** *vcetype* specified in **vce()**
**e(vcetype)** title used to label Std. Err.
**e(opt)** type of optimization
**e(which)** **max** or **min**; whether optimizer is to perform
maximization or minimization
**e(ml_method)** type of **ml** method
**e(user)** name of likelihood-evaluator program
**e(technique)** maximization technique
**e(properties)** **b V**
**e(predict)** program used to implement **predict**
**e(asbalanced)** factor variables **fvset** as **asbalanced**
**e(asobserved)** factor variables **fvset** as **asobserved**

Matrices
**e(b)** coefficient vector
**e(Cns)** constraints matrix
**e(ilog)** iteration log (up to 20 iterations)
**e(gradient)** gradient vector
**e(V)** variance-covariance matrix of the estimators
**e(V_modelbased)** model-based variance

Functions
**e(sample)** marks estimation sample
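
The same approach works after **gnbreg**. The sketch below assumes the rod93 dataset from the Examples section and relies on the macro names listed above; note that the list gives the offset variable as **e(offset1)** rather than **e(offset)**.

```stata
* Assumes the rod93 example dataset can be loaded with webuse
webuse rod93, clear
generate logexp = ln(exposure)
quietly gnbreg deaths age_mos, lnalpha(i.cohort) offset(logexp)

* Command name and offset variable as stored by gnbreg
display "`e(cmd)'"
display "`e(offset1)'"

* The coefficient vector includes the ln(alpha) equation
matrix list e(b)
```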