**[R] mfp** -- Multivariable fractional polynomial models

__Syntax__

**mfp** [**,** *options*] **:** *regression_cmd* [*yvar1* [*yvar2*]] *xvarlist* [*if*] [*in*]
[*weight*] [**,** *regression_cmd_options*]

*options*                            Description
-------------------------------------------------------------------------
Model 2
  __seq__**uential**                 use the Royston and Altman model-selection algorithm; default uses closed-test procedure
  __cyc__**les(***#***)**            maximum number of iteration cycles; default is **cycles(5)**
  __dfd__**efault(***#***)**         default maximum degrees of freedom; default is **dfdefault(4)**
  __cent__**er(***cent_list***)**    specification of centering for the independent variables
  __al__**pha(***alpha_list***)**    p-values for testing between FP models; default is **alpha(0.05)**
  **df(***df_list***)**              degrees of freedom for each predictor
  __po__**wers(***numlist***)**      list of FP powers to use; default is **powers(-2 -1(.5)1 2 3)**

Adv. model
  __xo__**rder(+**|**-**|**n)**      order of entry into model-selection algorithm; default is **xorder(+)**
  __sel__**ect(***select_list***)**  nominal p-values for selection on each predictor
  __xp__**owers(***xp_list***)**     FP powers for each predictor
  __zer__**o(***varlist***)**        treat nonpositive values of specified predictors as zero when FP transformations are applied
  __cat__**zero(***varlist***)**     add indicator variable for specified predictors
  **all**                            include out-of-sample observations in generated variables

Reporting
  __lev__**el(***#***)**             set confidence level; default is **level(95)**
  *display_options*                  control column formats and line width
-------------------------------------------------------------------------

*regression_cmd_options*             Description
-------------------------------------------------------------------------
Adv. model
  *regression_cmd_options*           options appropriate to the regression command in use
-------------------------------------------------------------------------

All weight types supported by *regression_cmd* are allowed; see weight.
See **[R] mfp postestimation** for features available after estimation.
**fp generate** may be used to create new variables containing fractional
polynomial powers. See **[R] fp**.

where

*regression_cmd* may be **clogit**, **glm**, **intreg**, **logistic**, **logit**, **mlogit**,
**nbreg**, **ologit**, **oprobit**, **poisson**, **probit**, **qreg**, **regress**, **rreg**, **stcox**,
**stcrreg**, **streg**, or **xtgee**.

*yvar1* is not allowed for **streg**, **stcrreg**, and **stcox**. For these
commands, you must first **stset** your data.

*yvar1* and *yvar2* must both be specified when *regression_cmd* is **intreg**.

*xvarlist* has elements of type *varlist* and/or **(***varlist***)**; for example,

**x1 x2 (x3 x4 x5)**

Elements enclosed in parentheses are tested jointly for inclusion in
the model and are not eligible for fractional polynomial
transformation.
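
As a minimal sketch of the grouping syntax, the call below uses the auto dataset from the Examples section. Treating **headroom** and **trunk** as a jointly tested pair is an illustrative assumption, not a modeling recommendation.

```stata
* Illustrative only: weight is eligible for FP transformation, whereas
* headroom and trunk are tested jointly and enter the model untransformed.
sysuse auto, clear
mfp: regress mpg weight (headroom trunk)
```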

__Menu__

**Statistics > Linear models and related > Fractional polynomials >**
**Multivariable fractional polynomial models**

__Description__

**mfp** selects the multivariable fractional polynomial (MFP) model that best
predicts the outcome variable from the right-hand-side variables in
*xvarlist*.

For univariate fractional polynomials, **fp** can be used to fit a wider
range of models than **mfp**. See **[R] fp** for more details.

__Options__

+---------+
----+ Model 2 +----------------------------------------------------------

**sequential** chooses the sequential fractional polynomial (FP) selection
algorithm (see *Methods of FP model selection* in **[R] mfp**).

**cycles(***#***)** sets the maximum number of iteration cycles permitted.
**cycles(5)** is the default.

**dfdefault(***#***)** determines the default maximum degrees of freedom (df) for a
predictor. The default is **dfdefault(4)** (second-degree FP).
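
The three options above can be combined in one call. The following is a sketch on the auto dataset; the particular values (10 cycles, 2 df) are arbitrary choices made for illustration.

```stata
sysuse auto, clear
* Sequential selection, at most 10 cycles, and at most a first-degree FP
* (2 df) for any predictor that does not get its own df() setting.
mfp, sequential cycles(10) dfdefault(2): regress mpg weight displacement foreign
```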

**center(***cent_list***)** defines the centering of the covariates *xvar1*, *xvar2*,
... of *xvarlist*. The default is **center(mean)**, except for binary
covariates, where it is **center(***#***)**, with *#* being the lower of the two
distinct values of the covariate. A typical item in *cent_list* is
*varlist***:**{**mean**|*#*|**no**}. Items are separated by commas. The first item
is special in that *varlist* is optional, and if it is omitted, the
default is reset to the specified value (**mean**, *#*, or **no**). For
example, **center(no, age:mean)** sets the default to **no** (that is, no
centering) and the centering of **age** to **mean**.
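
A short sketch of a *cent_list* with more than one item, again on the auto dataset. The centering value 200 for **displacement** is an arbitrary number picked for illustration.

```stata
sysuse auto, clear
* First item resets the default to no centering; weight is then centered
* at its mean and displacement at the (arbitrary) value 200.
mfp, center(no, weight:mean, displacement:200): regress mpg weight displacement foreign
```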

**alpha(***alpha_list***)** sets the significance levels for testing between FP
models of different degrees. The rules for *alpha_list* are the same as
those for *df_list* in the **df()** option. The default nominal p-value
(significance level, selection level) is 0.05 for all variables.

Example: **alpha(0.01)** specifies that all variables have an FP
selection level of 1%.

Example: **alpha(0.05, weight:0.1)** specifies that all variables except
**weight** have an FP selection level of 5%; **weight** has a level of 10%.

**df(***df_list***)** sets the df for each predictor. The df (not counting the
regression constant, **_cons**) is twice the degree of the FP, so, for
example, an *xvar* fit as a second-degree FP (FP2) has 4 df. The first
item in *df_list* may be either *#* or *varlist***:***#*. Subsequent items must
be *varlist***:***#*. Items are separated by commas, and *varlist* is
specified in the usual way for variables. With the first type of
item, the df for all predictors is taken to be *#*. With the second
type of item, all members of *varlist* (which must be a subset of
*xvarlist*) have *#* df.

The default number of degrees of freedom for a predictor of type
*varlist* specified in *xvarlist* but not in *df_list* is assigned
according to the number of distinct (unique) values of the predictor,
as follows:

-------------------------------------------
# of distinct values    Default df
-------------------------------------------
1                       (invalid predictor)
2-3                     1
4-5                     min(2, **dfdefault()**)
>=6                     **dfdefault()**
-------------------------------------------

Example: **df(4)**
All variables have 4 df.

Example: **df(2, weight displ:4)**
**weight** and **displ** have 4 df; all other variables have 2 df.

Example: **df(weight displ:4, mpg:2)**
**weight** and **displ** have 4 df, **mpg** has 2 df; all other variables have
default df.
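
The default-df rule in the table above can be reproduced by hand. This is a rough sketch, not the code **mfp** itself runs: it counts distinct nonmissing values over all observations with **tabulate**, whereas **mfp** works with its own estimation sample, so results may differ when observations are dropped. The local macro names are hypothetical.

```stata
sysuse auto, clear
* Mirrors the dfdefault(4) default
local dfdefault 4
foreach v of varlist weight displacement foreign rep78 {
    * r(r) holds the number of distinct nonmissing values
    quietly tabulate `v'
    local k = r(r)
    if `k' == 1 {
        local df "invalid predictor"
    }
    else if `k' <= 3 {
        local df 1
    }
    else if `k' <= 5 {
        local df = min(2, `dfdefault')
    }
    else {
        local df `dfdefault'
    }
    display "`v': `k' distinct values -> default df = `df'"
}
```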

**powers(***numlist***)** is the set of FP powers to be used. The default set is
-2, -1, -0.5, 0, 0.5, 1, 2, 3 (0 means log).
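
To make the power convention concrete, the sketch below builds a few FP terms by hand. It relies on the standard FP definition: power 0 denotes the natural log, and a second-degree FP with a repeated power *p* uses the terms x^p and x^p*ln(x). Rescaling **weight** to thousands of pounds and the new variable names are illustrative choices. **mfp** applies its own scaling and centering, so these variables will not match the ones it generates.

```stata
sysuse auto, clear
* Illustrative rescaling so that the powers stay numerically well behaved
generate double w = weight/1000

* First-degree FP terms for powers -2, 0 (log), and 0.5
generate double w_m2 = w^(-2)
generate double w_0  = ln(w)
generate double w_05 = sqrt(w)

* Second-degree FP with repeated power 1: x and x*ln(x)
generate double w_1a = w
generate double w_1b = w*ln(w)
regress mpg w_1a w_1b

* Restricting the candidate powers considered by mfp
mfp, powers(-1 0 1 2): regress mpg weight displacement foreign
```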

+------------+
----+ Adv. model +-------------------------------------------------------

**xorder(+**|**-**|**n)** determines the order of entry of the covariates into the
model-selection algorithm. The default is **xorder(+)**, which enters
them in decreasing order of significance in a multiple linear
regression (most significant first). **xorder(-)** places them in reverse
significance order, whereas **xorder(n)** respects the original order in
*xvarlist*.
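
For instance, the following sketch keeps the covariates in the order typed instead of ordering them by significance:

```stata
sysuse auto, clear
* Covariates are processed as listed: displacement, then weight, then foreign
mfp, xorder(n): regress mpg displacement weight foreign
```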

**select(***select_list***)** sets the nominal p-values (significance levels) for
variable selection by backward elimination. A variable is dropped if
its removal causes a nonsignificant increase in deviance. The rules
for *select_list* are the same as those for *df_list* in the **df()** option.
Using the default selection level of 1 for all variables forces them
all into the model. Setting the nominal p-value to be 1 for a given
variable forces it into the model, leaving others to be selected or
not. The nominal p-value for elements of *xvarlist* bound by
parentheses is specified by including **(***varlist***)** in *select_list*.

Example: **select(0.05)**
All variables have a nominal p-value of 5%.

Example: **select(0.05, weight:1)**
All variables except **weight** have a nominal p-value of 5%; **weight** is
forced into the model.

Example: **select(a (b c):0.05)**
All variables except **a**, **b**, and **c** are forced into the model. **b** and **c**
are tested jointly with 2 df at the 5% level, and **a** is tested singly
at the 5% level.

**xpowers(***xp_list***)** sets the permitted FP powers for covariates
individually. The rules for *xp_list* are the same as for *df_list* in
the **df()** option. The default selection is the same as that for the
**powers()** option.

Example: **xpowers(-1 0 1)**
All variables have powers -1, 0, 1.

Example: **xpowers(x5:-1 0 1)**
All variables except **x5** have default powers; **x5** has powers -1, 0, 1.

**zero(***varlist***)** treats negative and zero values of members of *varlist* as
zero when FP transformations are applied. By default, such variables
are subjected to a preliminary linear transformation to avoid
negative and zero values, as described in the **scale** option of **[R] fp**.
*varlist* must be part of *xvarlist*.

**catzero(***varlist***)** is a variation on **zero()**; see *Zeros and zero categories*
in **[R] mfp**. *varlist* must be part of *xvarlist*.
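
A hedged sketch of both options on the breast cancer data from the Examples section. It assumes that **x6** contains zeros, which the **count** line checks before fitting. The choice of covariates is illustrative only.

```stata
webuse brcancer, clear
stset rectime, fail(censrec)

* Check the assumption that x6 has nonpositive values
count if x6 <= 0

* Treat nonpositive x6 as zero when the FP transformation is applied
mfp, zero(x6): stcox x1 x5 x6 hormon, nohr

* As above, but with an indicator variable added for x6
mfp, catzero(x6): stcox x1 x5 x6 hormon, nohr
```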

*regression_cmd_options* may be any of the options appropriate to
*regression_cmd*.

**all** includes out-of-sample observations when generating the FP variables.
By default, the generated FP variables contain missing values outside
the estimation sample.

+-----------+
----+ Reporting +--------------------------------------------------------

**level(***#***)** specifies the confidence level, as a percentage, for confidence
intervals. The default is **level(95)** or as set by **set level**.

*display_options*: **cformat(***%fmt***)**, **pformat(***%fmt***)**, **sformat(***%fmt***)**, and
**nolstretch**; see **[R] estimation options**.

__Remarks__

For elements in *xvarlist* not enclosed in parentheses, **mfp** leaves
variables in the data named **I***xvar***__1**, **I***xvar***__2**, ..., where *xvar*
represents the first four letters of the name of *xvar1*, and so on for
*xvar2*, *xvar3*, etc. The new variables contain the best-fitting FP powers
of *xvar1*, *xvar2*, ....
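
One way to inspect the generated variables after fitting is sketched below. The wildcard is a convenience and assumes that no other variables in memory match the pattern **I*__***.

```stata
sysuse auto, clear
mfp: regress mpg weight displacement foreign
* List the FP variables that mfp left in the data
describe I*__*
```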

__Examples__

---------------------------------------------------------------------------
Setup
**. sysuse auto**

Fit MFP regression
**. mfp: regress mpg weight displacement foreign**

Specify 4 df for weight and displacement and 1 df for all other variables
**. mfp, df(1, weight displ:4): regress mpg weight displacement foreign**

Force **foreign** into the model; set a backward-elimination threshold of
0.05 for all other variables; specify 1 df for **foreign** and 2 df for the
other variables
**. mfp, select(0.05, foreign:1) df(2, foreign:1): regress mpg weight displacement foreign**

---------------------------------------------------------------------------
Setup
**. webuse brcancer, clear**
**. stset rectime, fail(censrec)**

Fit MFP Cox regression; force hormon into the model and set a
backward-elimination threshold of 0.05 for the other variables
**. mfp, select(0.05, hormon:1): stcox x1 x2 x3 x4a x4b x5 x6 x7 hormon, nohr**

---------------------------------------------------------------------------

__Stored results__

In addition to what *regression_cmd* stores, **mfp** stores the following in
**e()**:

Scalars
  **e(fp_nx)**          number of predictors in *xvarlist*
  **e(fp_dev)**         deviance of final model fit
  **e(Fp_id***#***)**   initial degrees of freedom for the *#*th element of *xvarlist*
  **e(Fp_fd***#***)**   final degrees of freedom for the *#*th element of *xvarlist*
  **e(Fp_al***#***)**   FP selection level for the *#*th element of *xvarlist*
  **e(Fp_se***#***)**   backward elimination selection level for the *#*th element of *xvarlist*

Macros
  **e(fp_cmd)**         **fracpoly**
  **e(fp_cmd2)**        **mfp**
  **e(cmdline)**        command as typed
  **e(fracpoly)**       command used to fit the selected model using **fracpoly**
  **e(fp_fvl)**         variables in final model
  **e(fp_depv)**        *yvar1* (*yvar2*)
  **e(fp_opts)**        estimation command options
  **e(fp_x1)**          first variable in *xvarlist*
  **e(fp_x2)**          second variable in *xvarlist*
  ...
  **e(fp_x***N***)**    last variable in *xvarlist*, N=**e(fp_nx)**
  **e(fp_k1)**          power for first variable in *xvarlist* (*)
  **e(fp_k2)**          power for second variable in *xvarlist* (*)
  ...
  **e(fp_k***N***)**    power for last variable in *xvarlist* (*), N=**e(fp_nx)**

Note: (*) contains `.' if the variable is not selected in the final
model.
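
The stored results listed above can be read back after estimation. The loop below is a minimal sketch that uses only the **e()** names documented in this section. The display layout is an arbitrary choice.

```stata
sysuse auto, clear
mfp: regress mpg weight displacement foreign

display "predictors: " e(fp_nx) "   deviance: " e(fp_dev)

* Walk through the predictors and report selected powers and final df;
* the power shows as . for a variable not selected in the final model
forvalues i = 1/`e(fp_nx)' {
    display "`e(fp_x`i')': power(s) `e(fp_k`i')', final df " e(Fp_fd`i')
}
```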