Stata 15 help for xi

[R] xi -- Interaction expansion

Syntax

xi [, prefix(string) noomit] term(s)

xi [, prefix(string) noomit] : any_stata_command varlist_with_terms ...

where a term has the form

i.varname                 or   I.varname
i.varname1*i.varname2          I.varname1*I.varname2
i.varname1*varname3            I.varname1*varname3
i.varname1|varname3            I.varname1|varname3

varname, varname1, and varname2 denote numeric or string categorical variables. varname3 denotes a continuous, numeric variable.

Menu

Data > Create or change data > Other variable-creation commands > Interaction expansion

+-----------------------------------------------------------------+
| Most commands in Stata now allow factor variables; see          |
| fvvarlist. To determine if a command allows factor variables,   |
| see the information printed below the options table for the     |
| command. If the command allows factor variables, it will say    |
| something like "indepvars may contain factor variables."        |
|                                                                 |
| We recommend that you use factor variables instead of xi if a   |
| command allows factor variables.                                |
|                                                                 |
| We include [R] xi in our documentation so that readers can      |
| consult it when using a Stata command that does not allow       |
| factor variables.                                               |
+-----------------------------------------------------------------+
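As a rough illustration of the recommendation above, the following sketch fits the same model with xi and with factor-variable notation. It assumes the auto dataset that ships with Stata; the variables mpg and rep78 are chosen only for illustration.

```stata
* Load the example dataset shipped with Stata (an assumption of this sketch)
sysuse auto, clear

* xi style: indicator variables named _Irep78_* are created in the dataset
xi: regress mpg i.rep78

* Factor-variable style: the same model, with no new variables created
regress mpg i.rep78
```

Both approaches omit the smallest value of rep78 by default, so the two fits should report the same coefficients.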

Description

xi expands terms containing categorical variables into indicator (also called dummy) variable sets by creating new variables and, in the second syntax (xi: any_stata_command), executes the specified command with the expanded terms. The dummy variables created are

i.varname creates dummies for categorical variable varname

i.varname1*i.varname2 creates dummies for categorical variables varname1 and varname2: all interactions and main effects

i.varname1*varname3 creates dummies for categorical variable varname1 and continuous variable varname3: all interactions and main effects

i.varname1|varname3 creates dummies for categorical variable varname1 and continuous variable varname3: all interactions and main effect of varname3, but no main effect of varname1

Options

prefix(string) allows you to choose a prefix other than _I for the newly created interaction variables. The prefix cannot be longer than four characters. By default, xi will create interaction variables starting with _I. When you use xi, it drops all previously created interaction variables starting with the prefix specified in the prefix(string) option or with _I by default. Therefore, if you want to keep the variables with a certain prefix, specify a different prefix in the prefix(string) option.

noomit prevents xi from omitting groups. With this option, xi generates an indicator variable for every category of each categorical variable, which is useful when combined with the noconstant option of an estimation command.
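The two options might be used as in the following sketch, which again assumes the auto dataset shipped with Stata; the prefix _S and the model are illustrative choices only.

```stata
* Example data (an assumption of this sketch)
sysuse auto, clear

* Use the prefix _S so that any existing _I* variables are left alone
xi, prefix(_S): regress mpg i.rep78

* Keep an indicator for every category of rep78 and drop the constant instead
xi, noomit: regress mpg i.rep78, noconstant
```

Because the second call uses the default prefix _I, the _S* variables created by the first call should remain in the dataset.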

Remarks

Remarks are presented under the following headings:

    Summary of i.varname
    Summary of controlling the omitted dummy
    Interpreting output
    How xi names variables
    xi as a command rather than a command prefix
    Warnings

Summary of i.varname

o varname may be string or numeric.

o Indicator (dummy) variables are created automatically.

o By default, the dummy-variable set is identified by dropping the dummy corresponding to the smallest value of the variable (how to specify otherwise is discussed below).

o The new dummy variables are left in your dataset. By default, the names of the new dummy variables start with _I; therefore, you can drop them by typing "drop _I*". You do not have to do this; each time you use xi, any previously created automatically generated dummies with the same prefix as the one specified in the prefix() option (_I by default) are dropped and new ones are created.

o The new dummy variables have variable labels so you can determine what they correspond to by typing "describe" or "describe _I*"; see [D] describe.

o xi may be used with any Stata command (not just logistic).
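The points above can be tried out with a short sketch like the following. It assumes the auto dataset shipped with Stata, and rep78 stands in for any categorical variable.

```stata
* Example data (an assumption of this sketch)
sysuse auto, clear

* Create the indicator variables without running an estimation command
xi i.rep78

* Inspect the variable labels to see which category each dummy represents
describe _I*

* Optional cleanup; the next xi call with the same prefix would drop them anyway
drop _I*
```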

Summary of controlling the omitted dummy

i.varname omits the first group by default, but if you define

char _dta[omit] "prevalent"

then the default behavior changes to that of dropping the most prevalent group. You can restore the default behavior by typing

char _dta[omit]

Either way, if you define a variable characteristic of the form

char varname[omit] #

or, if varname is a string,

char varname[omit] "string_literal"

then the specified value will be omitted.

Examples:

. char agegrp[omit] 1
. char race[omit] "White"       (for race a string variable)
. char agegrp[omit]             (to restore default)
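A complete session might look like the following sketch. It assumes the auto dataset shipped with Stata; omitting rep78==3 is an arbitrary choice for illustration.

```stata
* Example data (an assumption of this sketch)
sysuse auto, clear

* Ask xi to omit the dummy for rep78==3 instead of the smallest value
char rep78[omit] 3
xi: regress mpg i.rep78

* Restore the default behavior for rep78
char rep78[omit]

* Factor-variable counterpart for commands that allow it: base level 3
regress mpg ib3.rep78
```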

Interpreting output

. xi: regress mpg i.rep78
i.rep78           _Irep78_1-5         (naturally coded; _Irep78_1 omitted)
(output from regress appears)

Interpretation: i.rep78 expanded to the dummies _Irep78_1, _Irep78_2, ..., _Irep78_5. The numbers on the end are "naturally" coded in the sense that _Irep78_1 corresponds to rep78==1, _Irep78_2 to rep78==2, etc. Finally, the dummy for rep78==1 was omitted.

. xi: regress mpg i.make
i.make            _Imake_1-74         (_Imake_1 for make==AMC Concord omitted)
(output from regress appears)

Interpretation: i.make expanded to _Imake_1, _Imake_2, ..., _Imake_74. The coding is not natural because make is a string variable. _Imake_1 corresponds to one make, _Imake_2 another, and so on. We can find out the coding by typing "describe". _Imake_1 for the AMC Concord was chosen to be omitted.

How xi names variables

The names xi assigns to the dummy variables it creates are of the form

<prefix><stub>_<groupid>

By default, the prefix is _I:

_I<stub>_<groupid>

You may subsequently refer to the entire set of variables by <prefix><stub>*.

For example:

name           =  _I  +  <stub>   +  _  +  <groupid>     Entire set
--------------------------------------------------------------------
_Iagegrp_1        _I     agegrp      _     1             _Iagegrp*
_Iagegrp_2        _I     agegrp      _     2             _Iagegrp*
_IageXwgt_1       _I     ageXwgt     _     1             _IageXwgt*
_IageXrac_1_2     _I     ageXrac     _     1_2           _IageXrac*
_IageXrac_2_1     _I     ageXrac     _     2_1           _IageXrac*
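To see the names xi actually assigns in a given dataset, a sketch along the following lines may help. It assumes the auto dataset shipped with Stata; the exact stubs that xi forms for the interaction are not asserted here and should be read from the describe output.

```stata
* Example data (an assumption of this sketch)
sysuse auto, clear

* Main effects and interaction of two categorical variables
xi i.rep78*i.foreign

* List every variable xi created, with labels describing its coding
describe _I*
```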

xi as a command rather than a command prefix

xi can be used as a command prefix or as a command by itself. In the latter form, xi merely creates the indicator and interaction variables. Equivalent to typing

. xi: regress y i.agegrp*wgt

is

. xi i.agegrp*wgt
i.agegrp          _Iagegrp_1-4        (naturally coded; _Iagegrp_1 omitted)
i.agegrp*wgt      _IageXwgt_1-4       (coded as above)

. regress y _Iagegrp* _IageXwgt*

Warnings

- xi creates new variables in your data; most are bytes, but interactions with continuous variables will have the storage type of the underlying continuous variable; see [D] data types.

- When using xi with an estimation command, you may get the message "matsize too small". If so, see [R] matsize.

Examples

. xi: logistic outcome weight i.agegrp bp
. xi: logistic outcome weight bp i.agegrp i.race
. xi: logistic outcome weight bp i.agegrp*i.race
. xi: logistic outcome bp i.agegrp*weight i.race
. xi: logistic outcome bp i.agegrp|weight i.race
. xi: logistic outcome bp i.agegrp*weight i.agegrp*i.race
. xi, prefix(_S) : logistic outcome weight i.agegrp bp

Stored results

xi stores the following characteristics:

_dta[__xi__Vars__Prefix__]       prefix names
_dta[__xi__Vars__To__Drop__]     variables created
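These characteristics can be inspected after running xi, as in the following sketch. It assumes the auto dataset shipped with Stata, which may also carry unrelated dataset characteristics of its own, such as notes.

```stata
* Example data (an assumption of this sketch)
sysuse auto, clear

* Create indicator variables so that xi records its bookkeeping
xi i.rep78

* List all dataset-level characteristics, including those stored by xi
char list _dta[]
```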


© Copyright 1996–2018 StataCorp LLC