help xi
-------------------------------------------------------------------------------
Title
[R] xi -- Interaction expansion
Syntax
xi [, prefix(string) noomit] term(s)
xi [, prefix(string) noomit] : any_stata_command varlist_with_terms
...
where a term has the form
i.varname               or   I.varname
i.varname1*i.varname2        I.varname1*I.varname2
i.varname1*varname3          I.varname1*varname3
i.varname1|varname3          I.varname1|varname3
varname, varname1, and varname2 denote numeric or string categorical
variables. varname3 denotes a continuous, numeric variable.
Menu
Data > Create or change data > Other variable-creation commands >
Interaction expansion
+-----------------------------------------------------------------+
| Most commands in Stata now allow factor variables; see          |
| fvvarlist. To determine if a command allows factor variables,   |
| see the information printed below the options table for the     |
| command. If the command allows factor variables, it will say    |
| something like "indepvars may contain factor variables."        |
|                                                                 |
| We recommend that you use factor variables instead of xi if a   |
| command allows factor variables.                                |
|                                                                 |
| We include [R] xi in our documentation so that readers can      |
| consult it when using a Stata command that does not allow       |
| factor variables.                                               |
+-----------------------------------------------------------------+
Description
xi expands terms containing categorical variables into indicator (also
called dummy) variable sets by creating new variables, and, in the second
syntax (xi: any_stata_command), executes the specified command with the
expanded terms. The dummy variables created are
i.varname               creates dummies for categorical variable
                        varname.
i.varname1*i.varname2   creates dummies for categorical variables
                        varname1 and varname2: all interactions and
                        main effects.
i.varname1*varname3     creates dummies for categorical variable
                        varname1 and continuous variable varname3:
                        all interactions and main effects.
i.varname1|varname3     creates dummies for categorical variable
                        varname1 and continuous variable varname3:
                        all interactions and main effect of varname3,
                        but no main effect of varname1.
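The term forms above can be tried on the auto dataset that ships with Stata. The following is a minimal do-file sketch, not part of the original help text: the dataset and the variables mpg, rep78, and weight are illustrative assumptions, and the factor-variable lines are shown only as the likely modern equivalents.

```stata
* Illustrative sketch; assumes the auto dataset shipped with Stata.
sysuse auto, clear

* i.varname: dummies for the categorical variable rep78
xi: regress mpg i.rep78
* factor-variable form of the same model
regress mpg i.rep78

* i.varname1*varname3: both main effects plus all interactions
xi: regress mpg i.rep78*weight
scalar r2_xi = e(r2)
* factor-variable form of the same model
regress mpg i.rep78##c.weight
scalar r2_fv = e(r2)

* i.varname1|varname3: interactions and the main effect of weight,
* but no main effect of rep78
xi: regress mpg i.rep78|weight
* factor-variable form of the same model
regress mpg c.weight i.rep78#c.weight
```

The scalars r2_xi and r2_fv are hypothetical names introduced here so that the two fits can be compared after the run.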
Options
prefix(string) allows you to choose a prefix other than _I for the newly
created interaction variables. The prefix cannot be longer than four
characters. By default, xi will create interaction variables
starting with _I. When you use xi, it drops all previously created
interaction variables starting with the prefix specified in the
prefix(string) option or with _I by default. Therefore, if you want
to keep the variables with a certain prefix, specify a different
prefix in the prefix(string) option.
noomit prevents xi from omitting groups. This option provides a way to
generate an indicator variable for every category of one or more
variables, which is useful when combined with the noconstant option
of an estimation command.
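As a sketch of how the two options behave, the following do-file fragment uses the auto dataset shipped with Stata (an assumption made for illustration; the variable names are not from the original text).

```stata
* Illustrative sketch; assumes the auto dataset shipped with Stata.
sysuse auto, clear

* Default prefix: the new variables start with _I
xi i.rep78

* A different prefix (at most four characters) is used for the new
* variables; per the text above, the existing _I variables are kept
xi, prefix(_S) i.foreign

* noomit keeps an indicator for every category of rep78, so the
* constant is suppressed in the estimation command
xi, noomit: regress mpg i.rep78, noconstant
```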
Examples
. xi: logistic outcome weight i.agegrp bp
. xi: logistic outcome weight bp i.agegrp i.race
. xi: logistic outcome weight bp i.agegrp*i.race
. xi: logistic outcome bp i.agegrp*weight i.race
. xi: logistic outcome bp i.agegrp|weight i.race
. xi: logistic outcome bp i.agegrp*weight i.agegrp*i.race
. xi, prefix(_S) : logistic outcome weight i.agegrp bp
Summary of i.varname
o varname may be string or numeric.
o Indicator (dummy) variables are created automatically.
o By default, the dummy-variable set is identified by dropping the dummy
corresponding to the smallest value of the variable (how to specify
otherwise is discussed below).
o The new dummy variables are left in your dataset. By default, the
names of the new dummy variables start with _I; therefore, you can
drop them by typing "drop _I*". You do not have to do this; each
time you use xi, any previously created automatically generated
dummies with the same prefix as the one specified in the prefix()
option (_I by default) are dropped and new ones are created.
o The new dummy variables have variable labels so you can determine what
they correspond to by typing "describe" or "describe _I*"; see [D]
describe.
o xi may be used with any Stata command (not just logistic).
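The points above can be checked with a short do-file sketch. It assumes the auto dataset shipped with Stata; the choice of rep78 and foreign is illustrative only.

```stata
* Illustrative sketch; assumes the auto dataset shipped with Stata.
sysuse auto, clear

* Create the dummy set and inspect the variable labels
xi i.rep78
describe _I*

* Running xi again with the same prefix replaces the earlier set
xi i.foreign
capture confirm variable _Irep78_2
display _rc     // expected to be nonzero if the rep78 dummies were dropped

* Optional manual cleanup of whatever remains
drop _I*
```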
Summary of controlling the omitted dummy
i.varname omits the first group by default, but if you define
char _dta[omit] "prevalent"
then the default changes to omitting the most prevalent group. You can
restore the original default by typing
char _dta[omit]
Either way, if you define a variable characteristic of the form
char varname[omit] #
or, if varname is a string,
char varname[omit] "string_literal"
then the specified value will be omitted.
Examples:
. char agegrp[omit] 1
. char race[omit] "White"     (race is a string variable)
. char agegrp[omit]           (restores the default)
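A minimal sketch of the variable characteristic in use, assuming the auto dataset shipped with Stata (rep78 and the value 3 are illustrative choices, not from the original text):

```stata
* Illustrative sketch; assumes the auto dataset shipped with Stata.
sysuse auto, clear

* Omit the rep78==3 group instead of the smallest value
char rep78[omit] 3
xi i.rep78
describe _Irep78*

* Clear the characteristic to restore the default, then rebuild
char rep78[omit]
xi i.rep78
describe _Irep78*
```

Comparing the two describe listings should show which dummy was left out in each run.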
Interpreting output
. xi: regress mpg i.rep78
i.rep78 _Irep78_1-5 (naturally coded; _Irep78_1 omitted)
(output from regress appears)
Interpretation: i.rep78 expanded to the dummies _Irep78_1, _Irep78_2,
..., _Irep78_5. The numbers on the end are "naturally" coded in the
sense that _Irep78_1 corresponds to rep78==1, _Irep78_2 to rep78==2, etc.
Finally, the dummy for rep78==1 was omitted.
. xi: regress mpg i.make
i.make _Imake_1-74 (_Imake_1 for make==AMC Concord omitted)
(output from regress appears)
Interpretation: i.make expanded to _Imake_1, _Imake_2, ..., _Imake_74.
The coding is not natural because make is a string variable. _Imake_1
corresponds to one make, _Imake_2 another, and so on. We can find out
the coding by typing "describe". _Imake_1 for the AMC Concord was chosen
to be omitted.
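The two cases above can be reproduced with a short do-file sketch on the auto dataset shipped with Stata. The assert lines are an added check, not part of the original text; they are restricted to nonmissing rep78 because the handling of missing values is not described here.

```stata
* Illustrative sketch; assumes the auto dataset shipped with Stata.
sysuse auto, clear

* Numeric variable: the suffix matches the value of rep78
xi i.rep78
assert _Irep78_2 == (rep78 == 2) if !missing(rep78)
assert _Irep78_5 == (rep78 == 5) if !missing(rep78)

* String variable: the suffix is a group number, so read the coding
* from the variable labels
xi i.make
describe _Imake*
```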
How xi names variables
The names xi assigns to the dummy variables it creates are of the form
<prefix><stub>_<groupid>
By default, the prefix is _I:
_I<stub>_<groupid>
You may subsequently refer to the entire set of variables by
<prefix><stub>*.
For example:
name            =  _I  +  <stub>   +  _  +  <groupid>     Entire set
--------------------------------------------------------------------
_Iagegrp_1         _I     agegrp      _     1             _Iagegrp*
_Iagegrp_2         _I     agegrp      _     2             _Iagegrp*
_IageXwgt_1        _I     ageXwgt     _     1             _IageXwgt*
_IageXrac_1_2      _I     ageXrac     _     1_2           _IageXrac*
_IageXrac_2_1      _I     ageXrac     _     2_1           _IageXrac*
xi as a command rather than a command prefix
xi can be used as a command prefix or as a command by itself. In the
latter form, xi merely creates the indicator and interaction variables.
Equivalent to typing
. xi: regress y i.agegrp*wgt
is
. xi i.agegrp*wgt
i.agegrp          _Iagegrp_1-4        (naturally coded; _Iagegrp_1 omitted)
i.agegrp*wgt      _IageXwgt_1-4       (coded as above)
. regress y _Iagegrp* wgt _IageXwgt*
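The same comparison can be sketched on the auto dataset shipped with Stata. The variable choices and the scalar names r2_prefix and r2_command are assumptions made for illustration. The continuous variable weight is listed by name in the second regression on the assumption that xi creates only the new indicator and interaction variables.

```stata
* Illustrative sketch; assumes the auto dataset shipped with Stata.
sysuse auto, clear

* Prefix form: expand the term and run the command in one step
xi: regress mpg i.rep78*weight
scalar r2_prefix = e(r2)

* Command form: create the variables first, then refer to them by
* wildcard; weight already exists in the data and is named directly
xi i.rep78*weight
regress mpg _I* weight
scalar r2_command = e(r2)

* The two fits should agree
display r2_prefix - r2_command
```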
Warnings
- xi creates new variables in your data; most are bytes, but interactions
with continuous variables will have the storage type of the underlying
continuous variable; see [D] data types.
- When using xi with an estimation command, you may get the message
"matsize too small". If so, see [R] matsize.
Saved results
xi saves the following characteristics:
_dta[__xi__Vars__Prefix__] prefix names
_dta[__xi__Vars__To__Drop__] variables created
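These characteristics can be inspected after a run. The sketch below assumes the auto dataset shipped with Stata; the local macro name created is hypothetical.

```stata
* Illustrative sketch; assumes the auto dataset shipped with Stata.
sysuse auto, clear
xi i.rep78

* List all dataset characteristics, including those saved by xi
char list _dta[]

* Read the list of created variables into a local macro
local created : char _dta[__xi__Vars__To__Drop__]
display "`created'"
```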
Also see
Manual: [R] xi
Help: [U] 11.1.10 Prefix commands,
[U] 20 Estimation and postestimation commands (estimation),
[U] 20 Estimation and postestimation commands (postestimation)