**[P] _rmcoll** -- Remove collinear variables

__Syntax__

Identify variables to be omitted because of collinearity

**_rmcoll** *varlist* [*if*] [*in*] [*weight*] [**,** __nocons__**tant** __coll__**inear** __exp__**and**
__force__**drop**]

Identify independent variables to be omitted because of collinearity

**_rmdcoll** *depvar* *indepvars* [*if*] [*in*] [*weight*] [**,** __nocons__**tant** __coll__**inear**
__exp__**and** **normcoll**]

*varlist* and *indepvars* may contain factor variables; see fvvarlist.
*varlist*, *depvar*, and *indepvars* may contain time-series operators; see
tsvarlist.
**fweight**s, **aweight**s, **iweight**s, and **pweight**s are allowed; see weight.

__Description__

**_rmcoll** returns in **r(varlist)** an updated version of *varlist* that is
specific to the sample identified by **if**, **in**, and any missing values in
*varlist*. **_rmcoll** flags variables that are to be omitted because of
collinearity. If *varlist* contains factor variables, then **_rmcoll** also
enumerates the levels of factor variables, identifies the base levels of
factor variables, and identifies empty cells in interactions.

The following message is displayed for each variable that **_rmcoll** flags
as omitted because of collinearity:

note: ______ omitted because of collinearity

The following message is displayed for each empty cell of an interaction
that **_rmcoll** encounters:

note: ______ identifies no observations in the sample

**ml** users: it is not necessary to call **_rmcoll** because **ml** flags collinear
variables for you, assuming that you do not specify **ml** **model**'s **collinear**
option. Even so, **ml** programmers sometimes use **_rmcoll** because they need
the sample-specific set of variables, and in such cases, they specify **ml**
**model**'s **collinear** option so that **ml** does not waste time looking for
collinearity again.

**_rmdcoll** performs the same task as **_rmcoll** and additionally checks that
*depvar* is not collinear with the variables in *indepvars*. If *depvar* is
collinear with any of the variables in *indepvars*, then **_rmdcoll** reports
the following message and exits with error code 459:

______ collinear with ______

__Options__

**noconstant** specifies that, in looking for collinearity, an intercept not
be included. That is, a variable that contains the same nonzero
value in every observation should not be considered collinear.

**collinear** specifies that collinear variables not be flagged.

**expand** specifies that the expanded, level-specific variables be posted to
**r(varlist)**. This option will have an effect only if there are factor
variables in the variable list.

**forcedrop** specifies that collinear variables be dropped from the variable
list instead of being flagged. This option is not allowed when the
variable list already contains flagged variables, factor variables,
or interactions.

**normcoll** specifies that collinear variables have already been flagged in
*indepvars*. Otherwise, **_rmcoll** is called first to flag any such
collinearity.
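
The effect of **noconstant** and **forcedrop** can be compared side by side. The following is a minimal sketch, assuming the auto dataset shipped with Stata; the variables **one** and **tt** are hypothetical and exist only to create collinearity.

```stata
* Sketch: compare the default behavior with noconstant and forcedrop
sysuse auto, clear
generate byte one = 1          // illustrative nonzero constant
generate tt = turn + trunk     // illustrative exact linear combination

* Default: an intercept is assumed, so a constant is collinear with it
_rmcoll one weight
display "default:    `r(varlist)'"

* noconstant: no intercept is assumed when looking for collinearity
_rmcoll one weight, noconstant
display "noconstant: `r(varlist)'"

* Default: the collinear variable stays in the list but is flagged
_rmcoll turn trunk tt
display "flagged:    `r(varlist)'"

* forcedrop: the collinear variable is removed from the list instead
_rmcoll turn trunk tt, forcedrop
local kept `r(varlist)'
display "forcedrop:  `kept'"
```

**forcedrop** is applied here to a plain variable list because, as noted above, the option is not allowed with factor variables, interactions, or already-flagged variables.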

__Remarks__

**_rmcoll** and **_rmdcoll** are typically used when writing estimation commands.

**_rmcoll** is used if the programmer wants to flag the collinear variables
among the independent variables.

**_rmdcoll** is used if the programmer also wants to detect collinearity of
the dependent variable with the independent variables.

__Examples__

---------------------------------------------------------------------------
Setup
**. webuse auto**
**. generate tt = turn + trunk**

Use **_rmcoll** to detect the collinearity among the three variables and flag
one of them as omitted
**. _rmcoll turn trunk tt**
**. display r(varlist)**

Pass a factor variable to **_rmcoll**
**. _rmcoll i.rep78**
**. display r(varlist)**

Add the **expand** option so that **r(varlist)** contains the level-specific,
individual variables, which can then be looped over
**. _rmcoll i.rep78, expand**
**. display r(varlist)**

---------------------------------------------------------------------------

A code fragment for a program that uses **_rmcoll** might read

*...*
**syntax varlist [fweight iweight]** *...* **[, noCONStant** *...* **]**
**marksample touse**
**if "`weight'" != "" {**
**tempvar w**
**quietly gen double `w' `exp' if `touse'**
**local wgt [`weight'=`w']**
**}**
**else local wgt** */* is nothing */*
**gettoken depvar xvars : varlist**
**_rmcoll `xvars' `wgt' if `touse', `constant'**
**local xvars `r(varlist)'**
*...*

In this code fragment, **varlist** contains one dependent variable and zero
or more independent variables. The dependent variable is split off and
stored in the local macro **depvar**. Then the remaining variables are
passed through **_rmcoll**, and the resulting updated independent variable
list is stored in the local macro **xvars**.
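
The fragment can be placed in a complete program for experimentation. The following is a minimal sketch, not an estimation command: **myest_sketch** is a hypothetical name, the program only splits and cleans the variable list, and the auto dataset shipped with Stata is assumed. The weight variable is generated with **`exp'** alone because **syntax** stores the weight expression with its leading equal sign.

```stata
* Sketch only: myest_sketch is a hypothetical command name
capture program drop myest_sketch
program define myest_sketch, rclass
    version 11
    syntax varlist(min=2 numeric) [if] [in] [fweight iweight] [, noCONStant]
    marksample touse

    * Rebuild the weight as a temporary variable, if one was specified;
    * `exp' as set by syntax already begins with an equal sign
    if "`weight'" != "" {
        tempvar w
        quietly generate double `w' `exp' if `touse'
        local wgt [`weight'=`w']
    }

    * Split off the dependent variable, then flag collinear regressors
    gettoken depvar xvars : varlist
    _rmcoll `xvars' `wgt' if `touse', `constant'

    * Copy r() results before another r-class command replaces them
    local xvars `r(varlist)'
    local k = r(k_omitted)

    display as text "depvar: " as result "`depvar'"
    display as text "xvars:  " as result "`xvars'"

    return local depvar `depvar'
    return local xvars  `xvars'
    return scalar k_omitted = `k'
end

* Usage: tt is an illustrative variable that is collinear by construction
sysuse auto, clear
generate tt = turn + trunk
myest_sketch price turn trunk tt
return list
```

Returning the cleaned list in **r(xvars)** is a design choice made for this sketch; an actual estimation command would normally pass the list on to its fitting routine.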

---------------------------------------------------------------------------

__Stored results__

**_rmcoll** and **_rmdcoll** store the following in **r()**:

Scalars
**r(k_omitted)** number of omitted variables in **r(varlist)**

Macros
**r(varlist)** the flagged and expanded variable list
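
The stored results are typically copied into local macros right away, because the next r-class command replaces the contents of **r()**. The following is a minimal sketch, assuming the auto dataset shipped with Stata; **tt** is a hypothetical variable created only to produce one omitted variable.

```stata
* Sketch: save the stored results of _rmcoll before they are overwritten
sysuse auto, clear
generate tt = turn + trunk     // illustrative exact linear combination

_rmcoll turn trunk tt
local k    = r(k_omitted)      // number of variables flagged as omitted
local vars `r(varlist)'        // updated list, flagged variables included

display "number omitted: `k'"
display "updated list:   `vars'"

* Act on the count, for example to warn the user
if `k' > 0 {
    display "at least one variable was flagged because of collinearity"
}
```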