Stata 11 help for xi


Title

[R] xi -- Interaction expansion

Syntax

xi [, prefix(string) noomit] term(s)

xi [, prefix(string) noomit] : any_stata_command varlist_with_terms ...

where a term has the form

i.varname                  or   I.varname
i.varname1*i.varname2           I.varname1*I.varname2
i.varname1*varname3             I.varname1*varname3
i.varname1|varname3             I.varname1|varname3

varname, varname1, and varname2 denote numeric or string categorical variables. varname3 denotes a continuous, numeric variable.

Menu

Data > Create or change data > Other variable-creation commands > Interaction expansion

+-----------------------------------------------------------------+
| Most commands in Stata now allow factor variables; see          |
| fvvarlist. To determine if a command allows factor variables,   |
| see the information printed below the options table for the     |
| command. If the command allows factor variables, it will say    |
| something like "indepvars may contain factor variables."        |
|                                                                 |
| We recommend that you use factor variables instead of xi if a   |
| command allows factor variables.                                |
|                                                                 |
| We include [R] xi in our documentation so that readers can      |
| consult it when using a Stata command that does not allow       |
| factor variables.                                               |
+-----------------------------------------------------------------+

Description

xi expands terms containing categorical variables into indicator (also called dummy) variable sets by creating new variables, and, in the second syntax (xi: any_stata_command), executes the specified command with the expanded terms. The dummy variables created are as follows:

i.varname creates dummies for categorical variable varname.

i.varname1*i.varname2 creates dummies for categorical variables varname1 and varname2: all interactions and main effects.

i.varname1*varname3 creates dummies for categorical variable varname1 and continuous variable varname3: all interactions and main effects.

i.varname1|varname3 creates dummies for categorical variable varname1 and continuous variable varname3: all interactions and main effect of varname3, but no main effect of varname1.

Options

prefix(string) allows you to choose a prefix other than _I for the newly created interaction variables. The prefix cannot be longer than four characters. By default, xi will create interaction variables starting with _I. When you use xi, it drops all previously created interaction variables starting with the prefix specified in the prefix(string) option or with _I by default. Therefore, if you want to keep the variables with a certain prefix, specify a different prefix in the prefix(string) option.

noomit prevents xi from omitting groups. This option provides a way to generate an indicator variable for every category of one or more variables, which is useful when combined with the noconstant option of an estimation command.
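As a minimal sketch of noomit combined with noconstant, assuming the auto dataset shipped with Stata (loaded here with sysuse; it is not part of this help file), the following creates an indicator for every observed category of rep78 and fits the regression without a constant:

. sysuse auto, clear
. xi, noomit: regress mpg i.rep78, noconstant

Because no group is omitted and there is no constant, each coefficient in such a model is the mean of mpg within the corresponding category of rep78.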

Examples

. xi: logistic outcome weight i.agegrp bp
. xi: logistic outcome weight bp i.agegrp i.race
. xi: logistic outcome weight bp i.agegrp*i.race
. xi: logistic outcome bp i.agegrp*weight i.race
. xi: logistic outcome bp i.agegrp|weight i.race
. xi: logistic outcome bp i.agegrp*weight i.agegrp*i.race
. xi, prefix(_S) : logistic outcome weight i.agegrp bp

Summary of i.varname

o varname may be string or numeric.

o Indicator (dummy) variables are created automatically.

o By default, the dummy-variable set is identified by dropping the dummy corresponding to the smallest value of the variable (how to specify otherwise is discussed below).

o The new dummy variables are left in your dataset. By default, the names of the new dummy variables start with _I; therefore, you can drop them by typing "drop _I*". You do not have to do this; each time you use xi, any previously created automatically generated dummies with the same prefix as the one specified in the prefix() option (_I by default) are dropped and new ones are created.

o The new dummy variables have variable labels so you can determine what they correspond to by typing "describe" or "describe _I*"; see [D] describe.

o xi may be used with any Stata command (not just logistic).

Summary of controlling the omitted dummy

i.varname omits the first group by default, but if you define

char _dta[omit] "prevalent"

then the default behavior changes to that of dropping the most prevalent group. You can restore the default behavior by typing

char _dta[omit]

Either way, if you define a variable characteristic of the form

char varname[omit] #

or, if varname is a string,

char varname[omit] "string_literal"

then the specified value will be omitted.

Examples:

. char agegrp[omit] 1
. char race[omit] "White"       (for race a string variable)
. char agegrp[omit]             (to restore default)
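A short worked sequence may help show the variable characteristic in use. This is only a sketch, assuming the auto dataset shipped with Stata; the choice of category 3 is arbitrary:

. sysuse auto, clear
. char rep78[omit] 3
. xi: regress mpg i.rep78
. char rep78[omit]

The first char command requests that the dummy for rep78==3 be omitted, so the reported coefficients are contrasts against that group. The last command clears the characteristic, so later calls to xi return to the default choice of omitted group.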

Interpreting output

. xi: regress mpg i.rep78
i.rep78           _Irep78_1-5         (naturally coded; _Irep78_1 omitted)
(output from regress appears)

Interpretation: i.rep78 expanded to the dummies _Irep78_1, _Irep78_2, ..., _Irep78_5. The numbers on the end are "naturally" coded in the sense that _Irep78_1 corresponds to rep78==1, _Irep78_2 to rep78==2, etc. Finally, the dummy for rep78==1 was omitted.

. xi: regress mpg i.make
i.make            _Imake_1-74         (_Imake_1 for make==AMC Concord omitted)
(output from regress appears)

Interpretation: i.make expanded to _Imake_1, _Imake_2, ..., _Imake_74. The coding is not natural because make is a string variable. _Imake_1 corresponds to one make, _Imake_2 another, and so on. We can find out the coding by typing "describe". _Imake_1 for the AMC Concord was chosen to be omitted.

How xi names variables

The names xi assigns to the dummy variables it creates are of the form

<prefix><stub>_<groupid>

By default, the prefix is _I:

_I<stub>_<groupid>

You may subsequently refer to the entire set of variables by <prefix><stub>*.

For example:

name           =  _I  +  <stub>    +  _  +  <groupid>    Entire set
-------------------------------------------------------------------
_Iagegrp_1        _I     agegrp       _     1            _Iagegrp*
_Iagegrp_2        _I     agegrp       _     2            _Iagegrp*
_IageXwgt_1       _I     ageXwgt      _     1            _IageXwgt*
_IageXrac_1_2     _I     ageXrac      _     1_2          _IageXrac*
_IageXrac_2_1     _I     ageXrac      _     2_1          _IageXrac*
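To see the naming scheme on real data, one option is to expand an interaction of two categorical variables and then list the result. The sketch below assumes the auto dataset shipped with Stata, which is not referenced elsewhere in this section:

. sysuse auto, clear
. xi i.rep78*i.foreign
. describe _I*

describe shows the name and variable label of each generated variable, so the stub and group id that xi chose for the main effects and for the interaction terms can be read off directly.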

xi as a command rather than a command prefix

xi can be used as a command prefix or as a command by itself. In the latter form, xi merely creates the indicator and interaction variables. Equivalent to typing

. xi: regress y i.agegrp*wgt

is

. xi i.agegrp*wgt
i.agegrp          _Iagegrp_1-4        (naturally coded; _Iagegrp_1 omitted)
i.agegrp*wgt      _IageXwgt_1-4       (coded as above)

. regress y _Iagegrp* _IageXwgt*

Warnings

- xi creates new variables in your data; most are bytes, but interactions with continuous variables will have the storage type of the underlying continuous variable; see [D] data types.

- When using xi with an estimation command, you may get the message "matsize too small". If so, see [R] matsize.

Saved results

xi saves the following characteristics:

_dta[__xi__Vars__Prefix__]      prefix names
_dta[__xi__Vars__To__Drop__]    variables created
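These characteristics can be inspected with char list after xi has run. A minimal sketch, again assuming the auto dataset shipped with Stata:

. sysuse auto, clear
. xi i.rep78
. char list _dta[]

char list _dta[] displays all characteristics attached to the dataset, so the two xi entries should appear alongside any others the dataset already carries.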

Also see

Manual: [R] xi

Help: [U] 11.1.10 Prefix commands, [U] 20 Estimation and postestimation commands (estimation), [U] 20 Estimation and postestimation commands (postestimation)


© Copyright 1996–2009 StataCorp LP