**[XT] xtprobit** -- Random-effects and population-averaged probit models

__Syntax__

Random-effects (RE) model

**xtprobit** *depvar* [*indepvars*] [*if*] [*in*] [*weight*] [**, re** *RE_options*]

Population-averaged (PA) model

**xtprobit** *depvar* [*indepvars*] [*if*] [*in*] [*weight*] **, pa** [*PA_options*]

*RE_options* Description
-------------------------------------------------------------------------
Model
__nocons__**tant** suppress constant term
**re** use random-effects estimator; the default
__off__**set(***varname***)** include *varname* in model with coefficient
constrained to 1
__const__**raints(***constraints***)** apply specified linear constraints
__col__**linear** keep collinear variables
**asis** retain perfect predictor variables

SE/Robust
**vce(***vcetype***)** *vcetype* may be **oim**, __r__**obust**, __cl__**uster**
*clustvar*, __boot__**strap**, or __jack__**knife**

Reporting
__l__**evel(***#***)** set confidence level; default is **level(95)**
**lrmodel** perform the likelihood-ratio model test
instead of the default Wald test
__nocnsr__**eport** do not display constraints
*display_options* control columns and column formats, row
spacing, line width, display of omitted
variables and base and empty cells, and
factor-variable labeling

Integration
__intm__**ethod(***intmethod***)** integration method; *intmethod* may be
__mv__**aghermite** (the default) or __gh__**ermite**
__intp__**oints(***#***)** use *#* quadrature points; default is
**intpoints(12)**

Maximization
*maximize_options* control the maximization process; seldom
used

__coefl__**egend** display legend instead of statistics
-------------------------------------------------------------------------

*PA_options* Description
-------------------------------------------------------------------------
Model
__nocons__**tant** suppress constant term
**pa** use population-averaged estimator
__off__**set(***varname***)** include *varname* in model with coefficient
constrained to 1
**asis** retain perfect predictor variables

Correlation
__c__**orr(***correlation***)** within-panel correlation structure
**force** estimate even if observations unequally
spaced in time

SE/Robust
**vce(***vcetype***)** *vcetype* may be **conventional**, __r__**obust**,
__boot__**strap**, or __jack__**knife**
**nmp** use divisor N-P instead of the default N
__s__**cale(***parm***)** override the default scale parameter; *parm*
may be **x2**, **dev**, **phi**, or *#*

Reporting
__l__**evel(***#***)** set confidence level; default is **level(95)**
*display_options* control columns and column formats, row
spacing, line width, display of omitted
variables and base and empty cells, and
factor-variable labeling

Optimization
*optimize_options* control the optimization process; seldom
used

__coefl__**egend** display legend instead of statistics
-------------------------------------------------------------------------

*correlation* Description
-------------------------------------------------------------------------
__exc__**hangeable** exchangeable
__ind__**ependent** independent
__uns__**tructured** unstructured
__fix__**ed** *matname* user-specified
**ar** *#* autoregressive of order *#*
__sta__**tionary** *#* stationary of order *#*
__non__**stationary** *#* nonstationary of order *#*
-------------------------------------------------------------------------

A panel variable must be specified. For **xtprobit, pa**, correlation
structures other than **exchangeable** and **independent** require that a time
variable also be specified. Use **xtset**.
*indepvars* may contain factor variables; see fvvarlist.
*depvar* and *indepvars* may contain time-series operators; see tsvarlist.
**by**, **mi estimate**, and **statsby** are allowed; see prefix. **fp** is allowed for
the random-effects model.
**vce(bootstrap)** and **vce(jackknife)** are not allowed with the **mi estimate**
prefix.
**iweight**s, **fweight**s, and **pweight**s are allowed for the population-averaged
model, and **iweight**s are allowed for the random-effects model; see
weight. Weights must be constant within panel.
**coeflegend** does not appear in the dialog box.
See **[XT] xtprobit postestimation** for features available after estimation.
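
As a minimal sketch of the panel declaration described above, the following declares the panel and time variables before fitting. It assumes the **union** dataset used in the Examples section, with **idcode** as the panel identifier and **year** as the time variable; substitute your own variable names.

```stata
* Load the example data (assumed variables: idcode, year)
webuse union, clear

* Declare the panel variable and the time variable
xtset idcode year

* Summarize the panel structure before fitting any model
xtdescribe
```

Declaring only the panel variable (**xtset idcode**) is enough for the random-effects model and for the **exchangeable** and **independent** correlation structures.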

__Menu__

**Statistics > Longitudinal/panel data > Binary outcomes > Probit regression (RE, PA)**

__Description__

**xtprobit** fits random-effects and population-averaged probit models for a
binary dependent variable. The probability of a positive outcome is
assumed to be determined by the standard normal cumulative distribution
function.
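
The description above can be written compactly as follows. The notation is an assumption introduced here for illustration (it is not defined elsewhere in this entry): *i* indexes panels, *t* indexes observations within a panel, Φ is the standard normal cumulative distribution function, and ν_i is the panel-level random effect of the random-effects model.

```latex
% Random-effects model: panel-specific normal effect \nu_i
\Pr(y_{it} \neq 0 \mid \mathbf{x}_{it}, \nu_i)
    = \Phi(\mathbf{x}_{it}\boldsymbol{\beta} + \nu_i),
\qquad \nu_i \sim N(0, \sigma_\nu^2)

% Proportion of total variance contributed by the panel-level component
\rho = \frac{\sigma_\nu^2}{\sigma_\nu^2 + 1}
```

Under this notation, σ_ν and ρ correspond to the stored results **e(sigma_u)** and **e(rho)**.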

__Options for RE model__

+-------+
----+ Model +------------------------------------------------------------

**noconstant**; see **[R] estimation options**.

**re** requests the random-effects estimator. **re** is the default if neither
**re** nor **pa** is specified.

**offset(***varname***)**, **constraints(***constraints***)**, **collinear**; see **[R] estimation**
**options**.

**asis** forces retention of perfect predictor variables and their
associated, perfectly predicted observations and may produce
instabilities in maximization; see **[R] probit**.

+-----------+
----+ SE/Robust +--------------------------------------------------------

**vce(***vcetype***)** specifies the type of standard error reported, which
includes types that are derived from asymptotic theory (**oim**), that
are robust to some kinds of misspecification (**robust**), that allow for
intragroup correlation (**cluster** *clustvar*), and that use bootstrap or
jackknife methods (**bootstrap**, **jackknife**); see **[XT]** *vce_options*.

Specifying **vce(robust)** is equivalent to specifying **vce(cluster**
*panelvar***)**; see *xtprobit, re and the robust VCE estimator* in *Methods*
*and formulas* of **[XT] xtprobit**.
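
One way to see the equivalence stated above is to fit the same model both ways and compare the standard errors side by side. This is a sketch that assumes the **union** dataset with panel variable **idcode**; the stored-estimate names **rob** and **clu** are arbitrary.

```stata
webuse union, clear

* Random-effects probit with vce(robust)
xtprobit union age grade i.not_smsa, re vce(robust)
estimates store rob

* Same model, clustering explicitly on the panel variable
xtprobit union age grade i.not_smsa, re vce(cluster idcode)
estimates store clu

* Coefficients and standard errors from both fits, side by side
estimates table rob clu, se
```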

+-----------+
----+ Reporting +--------------------------------------------------------

**level(***#***)**, **lrmodel**, **nocnsreport**; see **[R] estimation options**.

*display_options*: **noci**, __nopv__**alues**, __noomit__**ted**, **vsquish**, __noempty__**cells**,
__base__**levels**, __allbase__**levels**, __nofvlab__**el**, **fvwrap(***#***)**, **fvwrapon(***style***)**,
**cformat(***%fmt***)**, **pformat(***%fmt***)**, **sformat(***%fmt***)**, and **nolstretch**; see **[R]**
**estimation options**.

+-------------+
----+ Integration +------------------------------------------------------

**intmethod(***intmethod***)**, **intpoints(***#***)**; see **[R] estimation options**.
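
For illustration, the integration options can be combined as in the following sketch, which requests nonadaptive Gauss-Hermite quadrature with 20 points instead of the defaults. It assumes the **union** dataset; the choice of 20 points is arbitrary.

```stata
webuse union, clear

* Nonadaptive Gauss-Hermite quadrature with 20 integration points
xtprobit union age grade i.not_smsa, re intmethod(ghermite) intpoints(20)

* The settings actually used are stored with the results
display "quadrature points:  " e(n_quad)
display "integration method: `e(intmethod)'"
```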

+--------------+
----+ Maximization +-----------------------------------------------------

*maximize_options*: __dif__**ficult**, __tech__**nique(***algorithm_spec***)**, __iter__**ate(***#***)**,
[__no__]__lo__**g**, __tr__**ace**, __grad__**ient**, **showstep**, __hess__**ian**, __showtol__**erance**,
__tol__**erance(***#***)**, __ltol__**erance(***#***)**, __nrtol__**erance(***#***)**, __nonrtol__**erance**, and
**from(***init_specs***)**; see **[R] maximize**. These options are seldom used.

The following option is available with **xtprobit** but is not shown in the
dialog box:

**coeflegend**; see **[R] estimation options**.

__Options for PA model__

+-------+
----+ Model +------------------------------------------------------------

**noconstant**; see **[R] estimation options**.

**pa** requests the population-averaged estimator.

**offset(***varname***)**; see **[R] estimation options**.

**asis** forces retention of perfect predictor variables and their
associated, perfectly predicted observations and may produce
instabilities in maximization; see **[R] probit**.

+-------------+
----+ Correlation +------------------------------------------------------

**corr(***correlation***)** specifies the within-panel correlation structure; the
default corresponds to the equal-correlation model,
**corr(exchangeable)**.

When you specify a correlation structure that requires a lag, you
indicate the lag after the structure's name with or without a blank;
for example, **corr(ar 1)** or **corr(ar1)**.

If you specify the fixed correlation structure, you specify the name
of the matrix containing the assumed correlations following the word
**fixed**, for example, **corr(fixed myr)**.
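
The following sketch shows both spellings of a lag-based structure and a fixed structure. To stay self-contained it uses a simulated balanced panel rather than real data; every name in it (**id**, **t**, **x**, **u**, **y**, and the matrix **myr**) is hypothetical, and the off-diagonal value of 0.3 is an arbitrary assumption.

```stata
* Simulated balanced panel: 200 panels, 5 equally spaced periods
clear
set seed 12345
set obs 200
generate id = _n
generate u = rnormal()
expand 5
bysort id: generate t = _n
generate x = rnormal()
generate byte y = (0.5*x + u + rnormal()) > 0
xtset id t

* Lag-based structure: the two spellings are equivalent
xtprobit y x, pa corr(ar 1)
xtprobit y x, pa corr(ar1)

* Fixed structure: 5 x 5 matrix, 1 on the diagonal and 0.3 elsewhere
matrix myr = J(5, 5, 0.3) + 0.7*I(5)
xtprobit y x, pa corr(fixed myr)
```

The fixed matrix must be square with one row per observation in the largest panel, which is why it is 5 x 5 here.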

**force** specifies that estimation be forced even though the time variable
is not equally spaced. This is relevant only for correlation
structures that require knowledge of the time variable. These
correlation structures require that observations be equally spaced so
that calculations based on lags correspond to a constant time change.
If you specify a time variable indicating that observations are not
equally spaced, the (time dependent) model will not be fit. If you
also specify **force**, the model will be fit, and it will be assumed
that the lags based on the data ordered by the time variable are
appropriate.
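
A minimal sketch of the behavior described above, again on simulated data with hypothetical names (**id**, **seq**, **t**, **x**, **u**, **y**). The time variable takes the values 1, 2, 4, 5, 7, so observations are not equally spaced; the first fit is wrapped in **capture noisily** so that the script continues whether or not it is refused.

```stata
clear
set seed 2024
set obs 200
generate id = _n
generate u = rnormal()
expand 5
bysort id: generate seq = _n

* Unequally spaced times: 1, 2, 4, 5, 7
generate t = cond(seq == 3, 4, cond(seq == 4, 5, cond(seq == 5, 7, seq)))
generate x = rnormal()
generate byte y = (0.5*x + u + rnormal()) > 0
xtset id t

* Without force: report the return code instead of stopping
capture noisily xtprobit y x, pa corr(ar 1)
display "return code without force: " _rc

* With force: lags are taken from the ordering of the observations
xtprobit y x, pa corr(ar 1) force
```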

+-----------+
----+ SE/Robust +--------------------------------------------------------

**vce(***vcetype***)** specifies the type of standard error reported, which
includes types that are derived from asymptotic theory
(**conventional**), that are robust to some kinds of misspecification
(**robust**), and that use bootstrap or jackknife methods (**bootstrap**,
**jackknife**); see **[XT]** *vce_options*.

**vce(conventional)**, the default, uses the conventionally derived
variance estimator for generalized least-squares regression.

**nmp**, **scale(x2**|**dev**|**phi**|*#***)**; see **[XT]** *vce_options*.
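
As a hedged illustration of these two options, the sketch below fits the population-averaged model first with the defaults and then with a Pearson chi-squared scale parameter and the N-P divisor. It assumes the **union** dataset; whether rescaling is appropriate for a given analysis is a separate question.

```stata
webuse union, clear

* Default divisor and default scale parameter
xtprobit union age grade i.not_smsa, pa nolog

* Pearson chi-squared-based scale parameter with divisor N-P
xtprobit union age grade i.not_smsa, pa scale(x2) nmp nolog

* The scale parameter and option settings are stored with the results
display "scale parameter (phi) = " e(phi)
display "scale option:           `e(scale)'"
```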

+-----------+
----+ Reporting +--------------------------------------------------------

**level(***#***)**; see **[R] estimation options**.

*display_options*: **noci**, __nopv__**alues**, __noomit__**ted**, **vsquish**, __noempty__**cells**,
__base__**levels**, __allbase__**levels**, __nofvlab__**el**, **fvwrap(***#***)**, **fvwrapon(***style***)**,
**cformat(***%fmt***)**, **pformat(***%fmt***)**, **sformat(***%fmt***)**, and **nolstretch**; see **[R]**
**estimation options**.

+--------------+
----+ Optimization +-----------------------------------------------------

*optimize_options* control the iterative optimization process. These
options are seldom used.

__iter__**ate(***#***)** specifies the maximum number of iterations. When the
number of iterations equals *#*, the optimization stops and presents
the current results, even if the convergence tolerance has not been
reached. The default is **iterate(100)**.

__tol__**erance(***#***)** specifies the tolerance for the coefficient vector.
When the relative change in the coefficient vector from one iteration
to the next is less than or equal to *#*, the optimization process is
stopped. **tolerance(1e-6)** is the default.

**nolog** suppresses the display of the iteration log.

__tr__**ace** specifies that the current estimates be printed at each
iteration.
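
The optimization options above might be combined as in this sketch, which tightens the tolerance, caps the number of iterations, and hides the iteration log. It assumes the **union** dataset; the specific values 50 and 1e-8 are arbitrary.

```stata
webuse union, clear

* Tighter tolerance, at most 50 iterations, no iteration log
xtprobit union age grade i.not_smsa, pa iterate(50) tolerance(1e-8) nolog

* Compare the target tolerance with the tolerance actually achieved
display "target tolerance:   " e(tol)
display "achieved tolerance: " e(dif)
```

If **e(dif)** exceeds **e(tol)**, the iteration limit was reached before the convergence criterion was met.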

The following option is available with **xtprobit** but is not shown in the
dialog box:

**coeflegend**; see **[R] estimation options**.

__Technical note__

The random-effects model is calculated using quadrature, which is an
approximation whose accuracy depends partially on the number of
integration points used. We can use the **quadchk** command to see if
changing the number of integration points affects the results. If the
results change, the quadrature approximation is not accurate given the
number of integration points. Try increasing the number of integration
points using the **intpoints()** option and run **quadchk** again. Do not
attempt to interpret the estimates when the coefficients reported by
**quadchk** differ substantially. See **[XT] quadchk** for details and
**[XT] xtprobit** for an example.
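
The workflow in this note can be sketched as follows, assuming the **union** dataset and the model from the Examples section. The second fit uses 20 points only as an illustrative larger value.

```stata
webuse union, clear

* Fit with the default number of integration points
xtprobit union age grade i.not_smsa south##c.year, re

* Refit with fewer and more points and report the changes
quadchk, nooutput

* If the changes are large, increase the points and check again
xtprobit union age grade i.not_smsa south##c.year, re intpoints(20)
quadchk, nooutput
```

**quadchk** refits the model twice, so on large datasets each check can take a while.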

Because the **xtprobit, re** likelihood function is calculated by
Gauss-Hermite quadrature, computations can be slow on large problems.
Computation time is roughly proportional to the number of quadrature
points used.
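
To get a rough sense of this cost on your own data, you could time two fits that differ only in the number of points. The sketch below assumes the **union** dataset; the values 8 and 16 and the timer numbers 1 and 2 are arbitrary.

```stata
webuse union, clear
timer clear

* Timer 1: 8 integration points
timer on 1
quietly xtprobit union age grade i.not_smsa, re intpoints(8)
timer off 1

* Timer 2: 16 integration points
timer on 2
quietly xtprobit union age grade i.not_smsa, re intpoints(16)
timer off 2

* Elapsed seconds for each fit
timer list
```

The two fits may also take different numbers of iterations, so the ratio of the times is only indicative.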

__Examples__

Setup
**. webuse union**

Random-effects model
**. xtprobit union age grade i.not_smsa south##c.year**

Equal-correlation population-averaged model
**. xtprobit union age grade i.not_smsa south##c.year, pa**

Equal-correlation population-averaged model with robust variance
**. xtprobit union age grade i.not_smsa south##c.year, pa** **vce(robust)**

__Stored results__

**xtprobit, re** stores the following in **e()**:

Scalars
**e(N)** number of observations
**e(N_g)** number of groups
**e(k)** number of parameters
**e(k_aux)** number of auxiliary parameters
**e(k_eq)** number of equations in **e(b)**
**e(k_eq_model)** number of equations in overall model test
**e(k_dv)** number of dependent variables
**e(df_m)** model degrees of freedom
**e(ll)** log likelihood
**e(ll_0)** log likelihood, constant-only model
**e(ll_c)** log likelihood, comparison model
**e(chi2)** chi-squared
**e(chi2_c)** chi-squared for comparison test
**e(N_clust)** number of clusters
**e(rho)** rho
**e(sigma_u)** panel-level standard deviation
**e(n_quad)** number of quadrature points
**e(g_min)** smallest group size
**e(g_avg)** average group size
**e(g_max)** largest group size
**e(p)** p-value for model test
**e(rank)** rank of **e(V)**
**e(rank0)** rank of **e(V)** for constant-only model
**e(ic)** number of iterations
**e(rc)** return code
**e(converged)** **1** if converged, **0** otherwise

Macros
**e(cmd)** **xtprobit**
**e(cmdline)** command as typed
**e(depvar)** name of dependent variable
**e(ivar)** variable denoting groups
**e(model)** **re**
**e(wtype)** weight type
**e(wexp)** weight expression
**e(title)** title in estimation output
**e(clustvar)** name of cluster variable
**e(offset)** linear offset variable
**e(chi2type)** **Wald** or **LR**; type of model chi-squared test
**e(chi2_ct)** **Wald** or **LR**; type of model chi-squared test
corresponding to **e(chi2_c)**
**e(vce)** *vcetype* specified in **vce()**
**e(vcetype)** title used to label Std. Err.
**e(intmethod)** integration method
**e(distrib)** **Gaussian**; the distribution of the random effect
**e(opt)** type of optimization
**e(which)** **max** or **min**; whether optimizer is to perform
maximization or minimization
**e(ml_method)** type of **ml** method
**e(user)** name of likelihood-evaluator program
**e(technique)** maximization technique
**e(properties)** **b V**
**e(predict)** program used to implement **predict**
**e(marginsdefault)** default **predict()** specification for **margins**
**e(asbalanced)** factor variables **fvset** as **asbalanced**
**e(asobserved)** factor variables **fvset** as **asobserved**

Matrices
**e(b)** coefficient vector
**e(Cns)** constraints matrix
**e(ilog)** iteration log
**e(gradient)** gradient vector
**e(V)** variance-covariance matrix of the estimators
**e(V_modelbased)** model-based variance

Functions
**e(sample)** marks estimation sample
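
As a short illustration of how these results can be used after a random-effects fit, the sketch below displays a few scalars and the coefficient vector, and recomputes rho from the panel-level standard deviation. It assumes the **union** dataset.

```stata
webuse union, clear
quietly xtprobit union age grade i.not_smsa, re

* Selected scalars
display "observations: " e(N) "   groups: " e(N_g)
display "rho     = " e(rho)
display "sigma_u = " e(sigma_u)

* rho recomputed as sigma_u^2 / (sigma_u^2 + 1)
display "check   = " e(sigma_u)^2 / (e(sigma_u)^2 + 1)

* Coefficient vector
matrix list e(b)
```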

**xtprobit, pa** stores the following in **e()**:

Scalars
**e(N)** number of observations
**e(N_g)** number of groups
**e(df_m)** model degrees of freedom
**e(chi2)** chi-squared
**e(p)** p-value for model test
**e(df_pear)** degrees of freedom from Pearson chi-squared
**e(chi2_dev)** chi-squared test of deviance
**e(chi2_dis)** chi-squared test of deviance dispersion
**e(deviance)** deviance
**e(dispers)** deviance dispersion
**e(phi)** scale parameter
**e(g_min)** smallest group size
**e(g_avg)** average group size
**e(g_max)** largest group size
**e(rank)** rank of **e(V)**
**e(tol)** target tolerance
**e(dif)** achieved tolerance
**e(rc)** return code

Macros
**e(cmd)** **xtgee**
**e(cmd2)** **xtprobit**
**e(cmdline)** command as typed
**e(depvar)** name of dependent variable
**e(ivar)** variable denoting groups
**e(tvar)** variable denoting time within groups
**e(model)** **pa**
**e(family)** **binomial**
**e(link)** **probit**; link function
**e(corr)** correlation structure
**e(scale)** **x2**, **dev**, **phi**, or *#*; scale parameter
**e(wtype)** weight type
**e(wexp)** weight expression
**e(offset)** linear offset variable
**e(chi2type)** **Wald**; type of model chi-squared test
**e(vce)** *vcetype* specified in **vce()**
**e(vcetype)** title used to label Std. Err.
**e(nmp)** **nmp**, if specified
**e(properties)** **b V**
**e(predict)** program used to implement **predict**
**e(marginsnotok)** predictions disallowed by **margins**
**e(asbalanced)** factor variables **fvset** as **asbalanced**
**e(asobserved)** factor variables **fvset** as **asobserved**

Matrices
**e(b)** coefficient vector
**e(R)** estimated working correlation matrix
**e(V)** variance-covariance matrix of the estimators
**e(V_modelbased)** model-based variance

Functions
**e(sample)** marks estimation sample
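
A comparable sketch for the population-averaged model, again assuming the **union** dataset. It shows the two command macros listed above and the estimated working correlation matrix.

```stata
webuse union, clear
quietly xtprobit union age grade i.not_smsa, pa

* e(cmd) names the underlying estimator; e(cmd2) names xtprobit
display "e(cmd)  = `e(cmd)'"
display "e(cmd2) = `e(cmd2)'"
display "family: `e(family)'   link: `e(link)'"

* Estimated working correlation matrix
matrix list e(R)
```

Programs that branch on the command name should check **e(cmd2)** as well as **e(cmd)** after a population-averaged fit.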