Title
[R] mfp -- Multivariable fractional polynomial models
Syntax
mfp [, options] : regression_cmd [yvar1 [yvar2]] xvarlist [if] [in]
[weight] [, regression_cmd_options]
options                 description
-------------------------------------------------------------------------
Model 2
  sequential            use the Royston and Altman model-selection
                          algorithm; default uses closed-test procedure
  cycles(#)             maximum number of iteration cycles; default is
                          cycles(5)
  dfdefault(#)          default maximum degrees of freedom; default is
                          dfdefault(4)
  center(cent_list)     specification of centering for the independent
                          variables
  alpha(alpha_list)     p-values for testing between FP models; default
                          is alpha(0.05)
  df(df_list)           degrees of freedom for each predictor
  powers(numlist)       list of FP powers to use; default is
                          powers(-2 -1(.5)1 2 3)
Adv. model
  xorder(+|-|n)         order of entry into model-selection algorithm;
                          default is xorder(+)
  select(select_list)   nominal p-values for selection on each predictor
  xpowers(xp_list)      FP powers for each predictor
  zero(varlist)         treat nonpositive values of specified predictors
                          as zero when the FP transformation is applied
  catzero(varlist)      add indicator variable for specified predictors
Reporting
  level(#)              set confidence level; default is level(95)
  all                   include out-of-sample observations in generated
                          variables
-------------------------------------------------------------------------
regression_cmd_options    description
-------------------------------------------------------------------------
Adv. model
  regression_cmd_options  options appropriate to the regression command
                            in use
-------------------------------------------------------------------------
All weight types supported by regression_cmd are allowed; see weight.
See [R] mfp postestimation for features available after estimation.
fracgen may be used to create new variables containing fractional
polynomial powers. See [R] fracpoly.
where
regression_cmd may be clogit, glm, intreg, logistic, logit, mlogit,
nbreg, ologit, oprobit, poisson, probit, qreg, regress, rreg, stcox,
streg, or xtgee.
yvar1 is not allowed for streg and stcox. For these commands, you
must first stset your data.
yvar1 and yvar2 must both be specified when regression_cmd is intreg.
xvarlist has elements of type varlist and/or (varlist); e.g.,
x1 x2 (x3 x4 x5)
Elements enclosed in parentheses are tested jointly for inclusion in
the model and are not eligible for fractional polynomial
transformation.
Menu
Statistics > Linear models and related > Fractional polynomials >
Multivariable fractional polynomial models
Description
mfp selects the multivariable fractional polynomial (MFP) model that best
predicts the outcome variable from the right-hand-side variables in
xvarlist.
Options
+---------+
----+ Model 2 +----------------------------------------------------------
sequential chooses the sequential fractional polynomial (FP) selection
algorithm.
cycles(#) sets the maximum number of iteration cycles permitted.
cycles(5) is the default.
dfdefault(#) determines the default maximum degrees of freedom (df) for a
predictor. The default is dfdefault(4) (second-degree FP).
center(cent_list) defines the centering of the covariates xvar1, xvar2,
... of xvarlist. The default is center(mean), except for binary
covariates, where it is center(#), with # being the lower of the two
distinct values of the covariate. A typical item in cent_list is
varlist:{mean|#|no}. Items are separated by commas. The first item
is special in that varlist is optional, and if it is omitted, the
default is reset to the specified value (mean, #, or no). For
example, center(no, age:mean) sets the default to no (i.e., no
centering) and the centering for age to mean.
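As a hedged illustration of the cent_list syntax, the following sketch uses
the auto dataset from the Examples section. The choice of variables is an
assumption made here for illustration only. It turns centering off by
default and centers only weight on its mean:

```stata
sysuse auto, clear
* first item (no varlist) resets the default to no centering;
* second item centers weight on its mean
mfp, center(no, weight:mean): regress mpg weight displacement foreign
```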
alpha(alpha_list) sets the significance levels for testing between FP
models of different degrees. The rules for alpha_list are the same as
for df_list in the df() option. The default nominal p-value
(significance level, selection level) is 0.05 for all variables.
Example: alpha(0.01) specifies that all variables have an FP
selection level of 1%.
Example: alpha(0.05, weight:0.1) specifies that all variables except
weight have an FP selection level of 5%; weight has a level of 10%.
df(df_list) sets the df for each predictor. The df (not counting the
regression constant, _cons) is twice the degree of the FP, so, for
example, an xvar fit as a second-degree FP (FP2) has 4 df. The first
item in df_list may be either # or varlist:#. Subsequent items must
be varlist:#. Items are separated by commas, and varlist is
specified in the usual way for variables. With the first type of
item, the df for all predictors is taken to be #. With the second
type of item, all members of varlist (which must be a subset of
xvarlist) have # df.
The default number of degrees of freedom for a predictor of type
varlist specified in xvarlist but not in df_list is assigned
according to the number of distinct (unique) values of the predictor,
as follows:
-------------------------------------------
# of distinct values    default df
-------------------------------------------
1                       (invalid predictor)
2-3                     1
4-5                     min(2, dfdefault())
>=6                     dfdefault()
-------------------------------------------
Example: df(4)
All variables have 4 df.
Example: df(2, weight displ:4)
weight and displ have 4 df; all other variables have 2 df.
Example: df(weight displ:4, mpg:2)
weight and displ have 4 df, mpg has 2 df, all other variables have
default df.
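The default-df rule in the table above can be reproduced by hand. The
following is only a sketch, not mfp's internal code: it assumes
dfdefault(4), uses rep78 from the auto dataset as an arbitrary example, and
counts distinct nonmissing values with tabulate. The local macro names
nvals and dfdef are hypothetical.

```stata
sysuse auto, clear
* count the distinct nonmissing values of a candidate predictor
quietly tabulate rep78
local nvals = r(r)
* assumed value of dfdefault()
local dfdef = 4
* apply the rule from the table
if `nvals' <= 1 {
    display "rep78: invalid predictor"
}
else if `nvals' <= 3 {
    display "rep78: default df = 1"
}
else if `nvals' <= 5 {
    display "rep78: default df = " min(2, `dfdef')
}
else {
    display "rep78: default df = " `dfdef'
}
```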
powers(numlist) is the set of FP powers to be used. The default set is
-2, -1, -0.5, 0, 0.5, 1, 2, 3 (0 means log).
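To make the default power set concrete, the corresponding terms can be
generated by hand for a single strictly positive predictor. This is only a
sketch: the variable names (w_m2, w_log, and so on) are hypothetical, and
mfp may scale and center a predictor before transforming it, so these
variables will not in general equal the Ixvar__# variables that mfp
creates.

```stata
sysuse auto, clear
* weight is strictly positive in this dataset, so no shift is needed
generate double w_m2  = weight^(-2)
generate double w_m1  = weight^(-1)
generate double w_mh  = weight^(-0.5)
* power 0 means the natural log
generate double w_log = ln(weight)
generate double w_h   = weight^(0.5)
* power 1 is weight itself, so no new variable is needed
generate double w_2   = weight^2
generate double w_3   = weight^3
```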
+------------+
----+ Adv. model +-------------------------------------------------------
xorder(+|-|n) determines the order of entry of the covariates into the
model-selection algorithm. The default is xorder(+), which enters
them in decreasing order of significance in a multiple linear
regression (most significant first). xorder(-) places them in reverse
significance order, whereas xorder(n) respects the original order in
xvarlist.
select(select_list) sets the nominal p-values (significance levels) for
variable selection by backward elimination. A variable is dropped if
its removal causes a nonsignificant increase in deviance. The rules
for select_list are the same as those for df_list in the df() option.
Using the default selection level of 1 for all variables forces them
all into the model. Setting the nominal p-value to be 1 for a given
variable forces it into the model, leaving others to be selected or
not. The nominal p-value for elements of xvarlist bound by
parentheses is specified by including (varlist) in select_list.
Example: select(0.05)
All variables have a nominal p-value of 5%.
Example: select(0.05, weight:1)
All variables except weight have a nominal p-value of 5%; weight is
forced into the model.
Example: select(a (b c):0.05)
All variables except a, b, and c are forced into the model. b and c
are tested jointly with 2 df at the 5% level, and a is tested singly
at the 5% level.
xpowers(xp_list) sets the permitted FP powers for covariates
individually. The rules for xp_list are the same as for df_list in
the df() option. The default selection is the same as that for the
powers() option.
Example: xpowers(-1 0 1)
All variables have powers -1, 0, 1.
Example: xpowers(x5:-1 0 1)
All variables except x5 have default powers; x5 has powers -1, 0, 1.
zero(varlist) treats negative and zero values of members of varlist as
zero when FP transformations are applied. By default, such variables
are subjected to a preliminary linear transformation to avoid
negative and zero values (see [R] fracpoly). varlist must be part of
xvarlist.
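catzero(varlist) is a variation on zero(): in addition to treating
nonpositive values as zero, it adds an indicator variable to the model for
each specified predictor. varlist must be part of xvarlist.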
regression_cmd_options may be any of the options appropriate to
regression_cmd.
+-----------+
----+ Reporting +--------------------------------------------------------
level(#) specifies the confidence level, as a percentage, for confidence
intervals. The default is level(95) or as set by set level.
all includes out-of-sample observations when generating the FP variables.
By default, the generated FP variables contain missing values outside
the estimation sample.
Remarks
For elements in xvarlist not enclosed in parentheses, mfp leaves
variables in the data named Ixvar__1, Ixvar__2, ..., where xvar
represents the first four letters of the name of xvar1, and so on for
xvar2, xvar3, etc. The new variables contain the best-fitting FP powers
of xvar1, xvar2, ....
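One way to see which variables were left behind is to list them after
fitting. The sketch below assumes the auto dataset and the first model from
the Examples section. The wildcard pattern is an assumption based on the
naming scheme described above, and which variables exist depends on the
model that mfp selects.

```stata
sysuse auto, clear
mfp: regress mpg weight displacement foreign
* list the FP variables mfp created (names of the form Ixvar__#)
describe I*__*
summarize I*__*
```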
Examples
---------------------------------------------------------------------------
. sysuse auto
. mfp: regress mpg weight displacement foreign
. mfp, df(1, weight displ:4): regress mpg weight displacement foreign
. mfp, select(0.05, foreign:1) df(2, foreign:1): regress mpg weight displacement foreign
---------------------------------------------------------------------------
. webuse brcancer, clear
. stset rectime, fail(censrec)
. mfp, alpha(.05) select(.05, hormon:1): stcox x1 x2 x3 x4a x4b x5 x6 x7 hormon, nohr
---------------------------------------------------------------------------
Saved results
In addition to what regression_cmd saves, mfp saves the following in e():
Scalars
  e(fp_nx)      number of predictors in xvarlist
  e(fp_dev)     deviance of final model fit
  e(Fp_id#)     initial degrees of freedom for the #th element of xvarlist
  e(Fp_fd#)     final degrees of freedom for the #th element of xvarlist
  e(Fp_al#)     FP selection level for the #th element of xvarlist
  e(Fp_se#)     backward elimination selection level for the #th element
                  of xvarlist
Macros
  e(fp_cmd)     fracpoly
  e(fp_cmd2)    mfp
  e(cmdline)    command as typed
  e(fp_fvl)     variables in final model
  e(fp_depv)    yvar1 (yvar2)
  e(fp_opts)    estimation command options
  e(fp_x1)      first variable in xvarlist
  e(fp_x2)      second variable in xvarlist
  ...
  e(fp_xN)      last variable in xvarlist, N=e(fp_nx)
  e(fp_k1)      power for first variable in xvarlist (*)
  e(fp_k2)      power for second variable in xvarlist (*)
  ...
  e(fp_kN)      power for last variable in xvarlist (*), N=e(fp_nx)
Note: (*) contains `.' if the variable is not selected in the final
model.
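The saved results listed above can be inspected after estimation. The
following sketch assumes the auto dataset and the first model from the
Examples section. It uses only the e() names documented in this section and
loops over the predictors to show the selected powers.

```stata
sysuse auto, clear
mfp: regress mpg weight displacement foreign
* scalars
display "predictors in xvarlist: " e(fp_nx)
display "final deviance:         " e(fp_dev)
* macros: one line per predictor; powers show as . if not selected
forvalues i = 1/`e(fp_nx)' {
    display "`e(fp_x`i')': powers = `e(fp_k`i')'"
}
```

Typing ereturn list after mfp shows the full set of saved results,
including those saved by regression_cmd.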
Also see
Manual: [R] mfp
Help: [R] mfp postestimation;
[R] fracpoly