Extending xi

Speakers  Michael N. Mitchell, UCLA Academic Technology Services
Phil Ender, UCLA Department of Education
Date  NASUG 2003

xi3 is an extension of xi that can use alternative coding systems in addition to indicator (dummy) coding: simple coding, deviation (effect) coding, Helmert coding, reverse Helmert coding, forward differences, backward differences, orthogonal polynomial contrasts, and user-defined coding.

Syntactically, xi3 is very similar to xi. It uses different prefixes to indicate the various coding schemes. It allows for three-way interactions and interactions that include continuous variables.

We begin with a model that students learning to use xi might try...

        . use http://www.gseis.ucla.edu/courses/data/hsb2
        
        . xi: regress write i.prog i.female

and they quickly progress to this...

        . xi: regress write i.prog*i.female
        
        . test _IproXfem_2_1 _IproXfem_3_1
        
        . test _Iprog_2 _Iprog_3
        
        . test _Ifemale_1

Although the interaction is correctly found to be nonsignificant, the last two tests do not give the main effects: with indicator coding, each one tests the effect of a variable at the omitted level of the other variable. Of course, it is possible to obtain the correct main effects with a more complicated version of the test command. For example, the main effect for female is...

        . test _Ifemale_1 + 1/3*_IproXfem_2_1 + 1/3*_IproXfem_3_1 = 0
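
The 1/3 weights come from averaging the female effect over the three levels of prog: under indicator coding that effect is _Ifemale_1 at the omitted level of prog, _Ifemale_1 + _IproXfem_2_1 at prog 2, and _Ifemale_1 + _IproXfem_3_1 at prog 3. As a sketch along the same lines, assuming the variable names that xi generated above and that prog 1 and female 0 are the omitted categories, lincom displays that average as an estimate, and the main effect for prog can be tested by averaging each prog contrast over the two levels of female.

        . xi: regress write i.prog*i.female

        . * female effect averaged over the three levels of prog
        . lincom _Ifemale_1 + 1/3*_IproXfem_2_1 + 1/3*_IproXfem_3_1

        . * main effect for prog: each contrast averaged over female (2 df)
        . test (_Iprog_2 + 1/2*_IproXfem_2_1 = 0) (_Iprog_3 + 1/2*_IproXfem_3_1 = 0)
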
The same results can be obtained using xi3 with deviation (effect) coding.

        . xi3: regress write d.prog*d.female
        
        . test _Ipr2Xfe1 _Ipr3Xfe1
        
        . test _Iprog_2 _Iprog_3
        
        . test _Ifemale_1
        
        . describe _Iprog_2 _Iprog_3 _Ifemale_1
Now both the interactions and main effects are correctly estimated.
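
To see what deviation coding does, the coded variables can be built by hand. This is a sketch, not xi3's implementation: the names dprog2, dprog3, dfem, dp2Xf, and dp3Xf are hypothetical, level 1 of prog and level 0 of female are assumed to be the levels coded -1 (xi3 may choose differently), and prog and female are assumed to have no missing values. The choice of the -1 level changes individual coefficients but not the tests below.

        . use http://www.gseis.ucla.edu/courses/data/hsb2, clear

        . * deviation codes: 1 for the named level, -1 for the reference level, 0 otherwise
        . generate dprog2 = (prog==2) - (prog==1)
        . generate dprog3 = (prog==3) - (prog==1)
        . generate dfem   = (female==1) - (female==0)

        . * interaction terms are products of the coded variables
        . generate dp2Xf = dprog2*dfem
        . generate dp3Xf = dprog3*dfem

        . regress write dprog2 dprog3 dfem dp2Xf dp3Xf

        . * interaction, then the two main effects
        . test dp2Xf dp3Xf
        . test dprog2 dprog3
        . test dfem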

Variables appear only once in the results even if they appear more than once in a model.

        . xi3: regress write d.prog*d.female d.prog*d.ses

xi3 allows three-way interactions among categorical variables.

        . xi3: regress write d.prog*d.ses*d.female

Interactions can include continuous variables, which can appear in any position, including the first.

        . xi3: regress write read*d.prog*d.female

Interactions do not have to include categorical variables; they can consist entirely of continuous variables.

        . xi3: regress write read*math*science

Switching to a dataset from the first edition of Kirk (1968), we demonstrate some of the other features of xi3.

        . use http://www.gseis.ucla.edu/courses/data/crf24

We will use simple coding for variable a and Helmert coding for variable b.

        . xi3: regress y s.a*h.b

        . describe _Ia_2 _Ib_1 _Ib_2 _Ib_3
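
The two schemes can be illustrated by coding the variables by hand. The sketch below assumes that a takes the values 1 and 2, that b takes the values 1 through 4, and that neither has missing values; the variable names are hypothetical, and the scaling xi3 uses for its own variables may differ. With these codes, the coefficient on sa2 is the difference between the two levels of a, and each hb coefficient compares one level of b with the mean of the later levels.

        . use http://www.gseis.ucla.edu/courses/data/crf24, clear

        . * simple coding for a: level 2 vs. level 1, with codes centered at zero
        . generate sa2 = cond(a==2, 1/2, -1/2)

        . * Helmert coding for b: each level vs. the mean of the subsequent levels
        . generate hb1 = cond(b==1, 3/4, -1/4)
        . generate hb2 = cond(b==1, 0, cond(b==2, 2/3, -1/3))
        . generate hb3 = cond(b<=2, 0, cond(b==3, 1/2, -1/2))

        . * interaction terms are products of the coded variables
        . generate sa2Xhb1 = sa2*hb1
        . generate sa2Xhb2 = sa2*hb2
        . generate sa2Xhb3 = sa2*hb3

        . regress y sa2 hb1 hb2 hb3 sa2Xhb1 sa2Xhb2 sa2Xhb3
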
Next, we will use the orthogonal polynomial coding for variable b.

        . xi3: regress y s.a o.b

        . describe _Ia_2 _Ib_1 _Ib_2 _Ib_3
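
For comparison, Stata's orthpoly command also generates orthogonal polynomial terms. This is a hedged sketch rather than a replica of xi3: the names sa2, blin, bquad, and bcub are hypothetical, a is assumed to take the values 1 and 2, and the scaling of orthpoly's variables need not match xi3's, so the coefficients may differ from those above.

        . use http://www.gseis.ucla.edu/courses/data/crf24, clear

        . generate sa2 = cond(a==2, 1/2, -1/2)

        . * linear, quadratic, and cubic orthogonal polynomial terms for b
        . orthpoly b, generate(blin bquad bcub) degree(3)

        . regress y sa2 blin bquad bcub

        . * joint test of the nonlinear (quadratic and cubic) trend components
        . test bquad bcub
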
Using the @ operator, we can obtain the trend effects of variable b at each level of variable a.

        . xi3: regress y [email protected]

        . describe _Ib1Wa1 _Ib1Wa2 _Ib2Wa1 _Ib2Wa2 _Ib3Wa1 _Ib3Wa2
We then follow this up with a test of simple main effects of variable b at a=1.

        . test _Ib1Wa1 _Ib2Wa1 _Ib3Wa1
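
The idea of estimating the trend in b separately within each level of a can be sketched by hand. The code below is an illustration under stated assumptions, not xi3's method: a is assumed to take the values 1 and 2, all variable names are hypothetical, and the trend terms come from orthpoly, whose scaling may differ from xi3's. Each trend term is switched on within one level of a, and the final test is the simple main effect of b at a=1.

        . use http://www.gseis.ucla.edu/courses/data/crf24, clear

        . generate sa2 = cond(a==2, 1/2, -1/2)
        . orthpoly b, generate(blin bquad bcub) degree(3)

        . * trend terms for b within a=1 (zero elsewhere)
        . generate blin_a1  = blin*(a==1)
        . generate bquad_a1 = bquad*(a==1)
        . generate bcub_a1  = bcub*(a==1)

        . * trend terms for b within a=2 (zero elsewhere)
        . generate blin_a2  = blin*(a==2)
        . generate bquad_a2 = bquad*(a==2)
        . generate bcub_a2  = bcub*(a==2)

        . regress y sa2 blin_a1 bquad_a1 bcub_a1 blin_a2 bquad_a2 bcub_a2

        . * simple main effect of b at a=1 (3 df)
        . test blin_a1 bquad_a1 bcub_a1
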
Stata's char command is used for user defined coding.

        . char b[user] (1 1 -1 -1 \ 1 -1 0 0 \ 0 0 1 -1)

        . xi3: regress y s.a*u.b

        . describe _Ib_1 _Ib_2 _Ib_3
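
Each row of the matrix stored in b[user] is a contrast among the four levels of b: levels 1 and 2 vs. levels 3 and 4, level 1 vs. level 2, and level 3 vs. level 4. As a rough check on what those rows mean, the same contrasts can be applied to the level means of b with lincom. This sketch makes several simplifying assumptions: it fits a one-way model that ignores a, so its standard errors will not match the factorial model above; the bcell names are hypothetical; and whether xi3 rescales the first contrast (for example, to a difference of averages) is not stated here.

        . use http://www.gseis.ucla.edu/courses/data/crf24, clear

        . * one indicator per level of b: bcell1 ... bcell4
        . tabulate b, generate(bcell)

        . * without a constant, each coefficient is the mean of y at one level of b
        . regress y bcell1 bcell2 bcell3 bcell4, noconstant

        . * the three rows of the user-defined matrix
        . lincom bcell1 + bcell2 - bcell3 - bcell4
        . lincom bcell1 - bcell2
        . lincom bcell3 - bcell4
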

Continuing concerns:
  • does not handle @ for 3 variables
  • fails if the length of a term exceeds 32
If you are interested in xi3, you know where to find it.

        . findit xi3
Here is the complete list of coding schemes and their prefixes:

     i.varname - Indicator (dummy) coding: compares each level to the omitted group
     c.varname - Centered indicator (dummy) coding
     s.varname - Simple coding: compares each level to a reference level
     d.varname - Deviation coding: deviations from the grand mean
     h.varname - Helmert coding: compares levels of a variable with the mean of subsequent levels
     r.varname - Reverse Helmert coding: compares levels of a variable with the mean of previous levels
     f.varname - Forward differences: adjacent levels, each vs. next
     b.varname - Backward differences: adjacent levels, each vs. previous
     o.varname - Orthogonal polynomial contrasts
     u.varname - User defined coding scheme
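
As one more illustration of how a scheme in this list translates into coded variables, backward differences for a four-level variable can be built by hand. The sketch assumes the crf24 variable b takes the values 1 through 4 with no missing values; the names bd1, bd2, and bd3 are hypothetical, and xi3's own variables may be scaled differently. With these codes, each coefficient is the difference between a level of b and the level before it.

        . use http://www.gseis.ucla.edu/courses/data/crf24, clear

        . * backward differences: level 2 vs. 1, level 3 vs. 2, level 4 vs. 3
        . generate bd1 = cond(b==1, -3/4, 1/4)
        . generate bd2 = cond(b<=2, -1/2, 1/2)
        . generate bd3 = cond(b==4, 3/4, -1/4)

        . regress y bd1 bd2 bd3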