Stata 15 help for mfp

[R] mfp -- Multivariable fractional polynomial models


mfp [, options] : regression_cmd [yvar1 [yvar2]] xvarlist [if] [in] [weight] [, regression_cmd_options]

    options                  Description
    -------------------------------------------------------------------------
    Model 2
      sequential             use the Royston and Altman model-selection
                               algorithm; default uses closed-test procedure
      cycles(#)              maximum number of iteration cycles; default is
                               cycles(5)
      dfdefault(#)           default maximum degrees of freedom; default is
                               dfdefault(4)
      center(cent_list)      specification of centering for the independent
                               variables
      alpha(alpha_list)      p-values for testing between FP models; default
                               is alpha(0.05)
      df(df_list)            degrees of freedom for each predictor
      powers(numlist)        list of FP powers to use; default is
                               powers(-2 -1(.5)1 2 3)

    Adv. model
      xorder(+|-|n)          order of entry into model-selection algorithm;
                               default is xorder(+)
      select(select_list)    nominal p-values for selection on each predictor
      xpowers(xp_list)       FP powers for each predictor
      zero(varlist)          treat nonpositive values of specified predictors
                               as zero when FP is transformed
      catzero(varlist)       add indicator variable for specified predictors
      all                    include out-of-sample observations in generated
                               variables

    Reporting
      level(#)               set confidence level; default is level(95)
      display_options        control column formats and line width
    -------------------------------------------------------------------------

    regression_cmd_options   Description
    -------------------------------------------------------------------------
    Adv. model
      regression_cmd_options options appropriate to the regression command
                               in use
    -------------------------------------------------------------------------

All weight types supported by regression_cmd are allowed; see weight. See [R] mfp postestimation for features available after estimation. fp generate may be used to create new variables containing fractional polynomial powers. See [R] fp.


regression_cmd may be clogit, glm, intreg, logistic, logit, mlogit, nbreg, ologit, oprobit, poisson, probit, qreg, regress, rreg, stcox, stcrreg, streg, or xtgee.

yvar1 is not allowed for streg, stcrreg, and stcox. For these commands, you must first stset your data.

yvar1 and yvar2 must both be specified when regression_cmd is intreg.

xvarlist has elements of type varlist and/or (varlist); for example,

x1 x2 (x3 x4 x5)

Elements enclosed in parentheses are tested jointly for inclusion in the model and are not eligible for fractional polynomial transformation.


Statistics > Linear models and related > Fractional polynomials > Multivariable fractional polynomial models


mfp selects the multivariable fractional polynomial (MFP) model that best predicts the outcome variable from the right-hand-side variables in xvarlist.

For univariate fractional polynomials, fp can be used to fit a wider range of models than mfp. See [R] fp for more details.


        +---------+
    ----+ Model 2 +----------------------------------------------------------

sequential chooses the sequential fractional polynomial (FP) selection algorithm (see Methods of FP model selection in [R] mfp).

cycles(#) sets the maximum number of iteration cycles permitted. cycles(5) is the default.

dfdefault(#) determines the default maximum degrees of freedom (df) for a predictor. The default is dfdefault(4) (second-degree FP).

center(cent_list) defines the centering of the covariates xvar1, xvar2, ... of xvarlist. The default is center(mean), except for binary covariates, where it is center(#), with # being the lower of the two distinct values of the covariate. A typical item in cent_list is varlist:{mean|#|no}. Items are separated by commas. The first item is special in that varlist is optional, and if it is omitted, the default is reset to the specified value (mean, #, or no). For example, center(no, age:mean) sets the default to no (that is, no centering) and the centering of age to mean.
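
The default centering rule can be sketched outside Stata. The Python snippet below is an illustration only, not mfp's implementation: the function name default_center is hypothetical, and it assumes that "binary" means exactly two distinct values.

```python
from statistics import mean


def default_center(values):
    """Return the default centering constant for one covariate.

    Assumed reading of the rule: a covariate with exactly two distinct
    values is centered on the lower one; any other covariate is centered
    on its mean.
    """
    distinct = sorted(set(values))
    if len(distinct) == 2:
        return distinct[0]
    return mean(values)


# A 0/1 covariate is centered on 0; a continuous one on its mean.
print(default_center([0, 1, 1, 0, 1]))
print(default_center([20, 30, 40]))
```

An explicit center() item for a variable would override this default.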

alpha(alpha_list) sets the significance levels for testing between FP models of different degrees. The rules for alpha_list are the same as those for df_list in the df() option. The default nominal p-value (significance level, selection level) is 0.05 for all variables.

Example: alpha(0.01) specifies that all variables have an FP selection level of 1%.

Example: alpha(0.05, weight:0.1) specifies that all variables except weight have an FP selection level of 5%; weight has a level of 10%.

df(df_list) sets the df for each predictor. The df (not counting the regression constant, _cons) is twice the degree of the FP, so, for example, an xvar fit as a second-degree FP (FP2) has 4 df. The first item in df_list may be either # or varlist:#. Subsequent items must be varlist:#. Items are separated by commas, and varlist is specified in the usual way for variables. With the first type of item, the df for all predictors is taken to be #. With the second type of item, all members of varlist (which must be a subset of xvarlist) have # df.

The default number of degrees of freedom for a predictor of type varlist specified in xvarlist but not in df_list is assigned according to the number of distinct (unique) values of the predictor, as follows:

    -------------------------------------------
    # of distinct values     Default df
    -------------------------------------------
             1               (invalid predictor)
            2-3              1
            4-5              min(2, dfdefault())
            >=6              dfdefault()
    -------------------------------------------

Example: df(4) All variables have 4 df.

Example: df(2, weight displ:4) weight and displ have 4 df; all other variables have 2 df.

Example: df(weight displ:4, mpg:2) weight and displ have 4 df, mpg has 2 df; all other variables have default df.
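
The interplay between df_list and the default df can be sketched as follows. This Python snippet is a simplified illustration, not how mfp parses its options: default_df and resolve_df are hypothetical names, variable-name abbreviations such as displ are not expanded, and six or more distinct values are assumed to fall in the top band.

```python
def default_df(n_distinct, dfdefault=4):
    """Default df from the number of distinct values of a predictor."""
    if n_distinct < 2:
        raise ValueError("a predictor with one distinct value is invalid")
    if n_distinct <= 3:
        return 1
    if n_distinct <= 5:
        return min(2, dfdefault)
    return dfdefault


def resolve_df(df_list, distinct_counts, dfdefault=4):
    """Map each predictor to its df.

    df_list is a string such as "2, weight displacement:4".
    distinct_counts maps each predictor name to its number of distinct values.
    """
    overall = None   # set by a leading bare number, e.g. "2"
    specific = {}    # set by "varlist:#" items
    items = [item.strip() for item in df_list.split(",") if item.strip()]
    for position, item in enumerate(items):
        if ":" in item:
            names, value = item.rsplit(":", 1)
            for name in names.split():
                specific[name] = int(value)
        elif position == 0:
            overall = int(item)
        else:
            raise ValueError("only the first item may omit the varlist")
    resolved = {}
    for name, n_distinct in distinct_counts.items():
        if name in specific:
            resolved[name] = specific[name]
        elif overall is not None:
            resolved[name] = overall
        else:
            resolved[name] = default_df(n_distinct, dfdefault)
    return resolved


counts = {"weight": 64, "displacement": 31, "foreign": 2}
print(resolve_df("2, weight displacement:4", counts))
print(resolve_df("weight:4", counts))
```

In the second call, only weight is listed, so displacement and foreign fall back to the distinct-value rule.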

powers(numlist) is the set of FP powers to be used. The default set is -2, -1, -0.5, 0, 0.5, 1, 2, 3 (0 means log).
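
To make the power set concrete, the Python sketch below evaluates FP terms and counts the candidate models for one predictor under the default powers. It is an illustration, not mfp's code: the function names are hypothetical, and the treatment of a repeated power (the second term is multiplied by ln x) follows the usual fractional polynomial convention rather than anything stated in this help text.

```python
import math
from itertools import combinations_with_replacement

# Default power set; 0 stands for the natural log.
DEFAULT_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_term(x, p):
    """Return x**p, with p == 0 read as ln(x). Requires x > 0."""
    if x <= 0:
        raise ValueError("FP terms are defined for positive x only")
    return math.log(x) if p == 0 else x ** p


def fp2_terms(x, p1, p2):
    """Return the two terms of a second-degree FP with powers (p1, p2).

    Assumed convention: when p1 == p2 the second term is the first
    term multiplied by ln(x).
    """
    first = fp_term(x, p1)
    if p1 == p2:
        return first, first * math.log(x)
    return first, fp_term(x, p2)


fp1_models = list(DEFAULT_POWERS)
fp2_models = list(combinations_with_replacement(DEFAULT_POWERS, 2))
print(len(fp1_models), len(fp2_models))  # 8 36
```

With eight powers there are 8 first-degree and 36 second-degree candidates per predictor, which is why narrowing powers() or xpowers() shrinks the search noticeably.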

        +------------+
    ----+ Adv. model +-------------------------------------------------------

xorder(+|-|n) determines the order of entry of the covariates into the model-selection algorithm. The default is xorder(+), which enters them in decreasing order of significance in a multiple linear regression (most significant first). xorder(-) places them in reverse significance order, whereas xorder(n) respects the original order in xvarlist.

select(select_list) sets the nominal p-values (significance levels) for variable selection by backward elimination. A variable is dropped if its removal causes a nonsignificant increase in deviance. The rules for select_list are the same as those for df_list in the df() option. Using the default selection level of 1 for all variables forces them all into the model. Setting the nominal p-value to be 1 for a given variable forces it into the model, leaving others to be selected or not. The nominal p-value for elements of xvarlist bound by parentheses is specified by including (varlist) in select_list.

Example: select(0.05) All variables have a nominal p-value of 5%.

Example: select(0.05, weight:1) All variables except weight have a nominal p-value of 5%; weight is forced into the model.

Example: select(a (b c):0.05) All variables except a, b, and c are forced into the model. b and c are tested jointly with 2 df at the 5% level, and a is tested singly at the 5% level.

xpowers(xp_list) sets the permitted FP powers for covariates individually. The rules for xp_list are the same as for df_list in the df() option. The default selection is the same as that for the powers() option.

Example: xpowers(-1 0 1) All variables have powers -1, 0, 1.

Example: xpowers(x5:-1 0 1) All variables except x5 have default powers; x5 has powers -1, 0, 1.

zero(varlist) treats negative and zero values of members of varlist as zero when FP transformations are applied. By default, such variables are subjected to a preliminary linear transformation to avoid negative and zero values, as described in the scale option of [R] fp. varlist must be part of xvarlist.

catzero(varlist) is a variation on zero(); see Zeros and zero categories in [R] mfp. varlist must be part of xvarlist.
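
The effect of zero() and catzero() on a single FP term can be sketched as below. This Python snippet is an assumption-laden illustration, not mfp's implementation: the function names are hypothetical, and the coding of the catzero() indicator (1 for nonpositive values) is assumed rather than taken from this help text.

```python
import math


def fp_term_zero(x, p):
    """FP term under zero(): nonpositive x contributes 0 to the term."""
    if x <= 0:
        return 0.0
    return math.log(x) if p == 0 else float(x) ** p


def catzero_terms(x, p):
    """FP term under catzero(): the zero() term plus an indicator.

    Assumption: the indicator is 1 when x is nonpositive and 0 otherwise.
    """
    indicator = 1 if x <= 0 else 0
    return indicator, fp_term_zero(x, p)


# A covariate such as cigarettes per day: 0 for nonsmokers.
for value in (0, 1, 20):
    print(value, fp_term_zero(value, -1), catzero_terms(value, 0))
```

The indicator lets the model fit a separate level for the zero category while the FP describes the positive values.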

regression_cmd_options may be any of the options appropriate to regression_cmd.

all includes out-of-sample observations when generating the FP variables. By default, the generated FP variables contain missing values outside the estimation sample.

        +-----------+
    ----+ Reporting +--------------------------------------------------------

level(#) specifies the confidence level, as a percentage, for confidence intervals. The default is level(95) or as set by set level.

display_options: cformat(%fmt), pformat(%fmt), sformat(%fmt), and nolstretch; see [R] estimation options.


For elements in xvarlist not enclosed in parentheses, mfp leaves variables in the data named Ixvar__1, Ixvar__2, ..., where xvar represents the first four letters of the name of xvar1, and so on for xvar2, xvar3, etc. The new variables contain the best-fitting FP powers of xvar1, xvar2, ....
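
The naming pattern can be sketched in a few lines of Python. The function generated_names is hypothetical and only mirrors the pattern described above; it does not show how mfp resolves predictors that share their first four letters.

```python
def generated_names(xvar, n_terms):
    """Build names of the form I<first four letters>__<k>, k = 1..n_terms."""
    stem = "I" + xvar[:4]
    return [f"{stem}__{k}" for k in range(1, n_terms + 1)]


print(generated_names("weight", 2))        # ['Iweig__1', 'Iweig__2']
print(generated_names("displacement", 1))  # ['Idisp__1']
```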


---------------------------------------------------------------------------
Setup
    . sysuse auto

Fit MFP regression
    . mfp: regress mpg weight displacement foreign

Specify 4 df for weight and displacement and 1 df for all other variables
    . mfp, df(1, weight displ:4): regress mpg weight displacement foreign

Force foreign into the model; set a backward-elimination threshold of 0.05 for all other variables; specify 1 df for foreign and 2 df for the other variables
    . mfp, select(0.05, foreign:1) df(2, foreign:1): regress mpg weight displacement foreign

---------------------------------------------------------------------------
Setup
    . webuse brcancer, clear
    . stset rectime, fail(censrec)

Fit MFP Cox regression; force hormon into the model and set a backward-elimination threshold of 0.05 for the other variables
    . mfp, select(0.05, hormon:1): stcox x1 x2 x3 x4a x4b x5 x6 x7 hormon, nohr


Stored results

In addition to what regression_cmd stores, mfp stores the following in e():

    Scalars
      e(fp_nx)       number of predictors in xvarlist
      e(fp_dev)      deviance of final model fit
      e(Fp_id#)      initial degrees of freedom for the #th element of xvarlist
      e(Fp_fd#)      final degrees of freedom for the #th element of xvarlist
      e(Fp_al#)      FP selection level for the #th element of xvarlist
      e(Fp_se#)      backward elimination selection level for the #th element
                       of xvarlist

    Macros
      e(fp_cmd)      fracpoly
      e(fp_cmd2)     mfp
      e(cmdline)     command as typed
      e(fracpoly)    command used to fit the selected model using fracpoly
      e(fp_fvl)      variables in final model
      e(fp_depv)     yvar1 (yvar2)
      e(fp_opts)     estimation command options
      e(fp_x1)       first variable in xvarlist
      e(fp_x2)       second variable in xvarlist
      ...
      e(fp_xN)       last variable in xvarlist, N=e(fp_nx)
      e(fp_k1)       power for first variable in xvarlist (*)
      e(fp_k2)       power for second variable in xvarlist (*)
      ...
      e(fp_kN)       power for last var. in xvarlist (*), N=e(fp_nx)

Note: (*) contains `.' if the variable is not selected in the final model.
