# st: RE: -xi- and interactions (was:your stata query)

From: "Scott Merryman"; Subject: st: RE: -xi- and interactions (was:your stata query); Date: Wed, 22 Jun 2005 19:26:28 -0500

```
I am not sure this is correct. In the 2-variable cross-product case, the
variable T would be equivalent to T1, the variable A would be equal to A1, and
the cross product TxA would be equal to T1A1. With these three coefficients
and the constant, all 4 combinations can be accounted for.

T = 0, A = 0 would be the constant
T = 0, A = 1 would be A plus the constant
T = 1, A = 0 would be T plus the constant
T = 1, A = 1 would be T plus A plus TxA plus the constant

For the 3-variable case, in the program below, the reference category for
the cross-product method is T = 0, A = 0, and B = 0, whose predicted value is
the constant. At the end of the program, a table is produced showing all 8
combinations of the categorical variables for both the "all dummy" and
"cross product" methods. The results are identical for both methods.

+---------------------------------------------------+
|          categories   all_dummies   cross_product |
|---------------------------------------------------|
| T = 0, A = 0, B = 0      3424.737        3424.737 |
| T = 0, A = 0, B = 1      4062.308        4062.308 |
| T = 0, A = 1, B = 0        2730.5          2730.5 |
| T = 0, A = 1, B = 1      3368.071        3368.071 |
|---------------------------------------------------|
| T = 1, A = 0, B = 0          2372            2372 |
| T = 1, A = 0, B = 1          3420            3420 |
| T = 1, A = 1, B = 0      1991.667        1991.667 |
| T = 1, A = 1, B = 1      3039.667        3039.667 |
+---------------------------------------------------+

Scott

----------------------------------------------------------
sysuse auto, clear
qui {
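gen t = foreign          // T: 1 = foreign, 0 = domestic
mark a if price < 4500   // A: 1 if price < 4500, 0 otherwise
mark b if mpg < 17       // B: 1 if mpg < 17, 0 otherwise
tab t, gen(t)            // t1 = (t == 0), t2 = (t == 1)
tab a, gen(a)            // a1 = (a == 0), a2 = (a == 1)
tab b, gen(b)            // b1 = (b == 0), b2 = (b == 1)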

gen t0a0 = t1*a1
gen t1a0 = t2*a1
gen t0a1 = t1*a2
gen t1a1 = t2*a2

gen t0b0 = t1*b1
gen t1b0 = t2*b1
gen t0b1 = t1*b2
gen t1b1 = t2*b2

xi i.t*i.a i.t*i.b

gen all_dummies = .
local n = 1
forv i = 0/1 {
forv j = 0/1 {
forv k = 0/1 {
replace all_dummies = _b[t`i'a`j'] + _b[t`i'b`k'] in `n'
local n = `n' + 1
}
}
}

gen categories = ""
local n = 1
forv i = 0/1 {
forv j = 0/1 {
forv k = 0/1 {
replace cate = "T = `i', A = `j', B = `k'" in `n'
local n = `n' + 1
}
}
}

gen cross_product = .
replace cross_product = _b[_cons] in 1
replace cross_product = _b[_cons] + _b[_Ib_1] in 2
replace cross_product = _b[_cons] + _b[_Ia_1] in 3
replace cross_product = _b[_cons] + _b[_Ib_1] + _b[_Ia_1] in 4
replace cross_product = _b[_cons] + _b[_It_1] in 5
replace cross_product = _b[_cons] + _b[_It_1] + _b[_Ib_1] + _b[_ItXb_1_1] in 6
replace cross_product = _b[_cons] + _b[_It_1] + _b[_Ia_1] + _b[_ItXa_1_1] in 7
replace cross_product = _b[_cons] + _b[_It_1] + _b[_Ia_1] + _b[_Ib_1] ///
+ _b[_ItXa_1_1] + _b[_ItXb_1_1] in 8

}

l cate all_dummies cross_product in 1/8, noobs abb(32) sep(4)

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of "Laplante, Benoît"
> Sent: Thursday, June 16, 2005 3:22 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: -xi- and interactions (was:your stata query)
>
> The recent postings on -xi- reminded me of the following riddle about
> interactions.
>
> Say you have two variables, T and A. Each has two values, low and high,
> coded 0 and 1 respectively for both variables. You assume that the
> effect of A varies across the values of T, which is a very basic
> definition of what an interaction is about. There are (at least) two ways
> to deal with the problem.
>
> The first one is simply to build dummies that represent all the
> combinations of values of the two variables and put all of them, minus
> one, in the equation. So you would have T0A0, T0A1, T1A0 and T1A1. The
> most obvious choice would be to exclude T0A0, but excluding any of the
> four variables would provide equivalent results. So you would use three
> variables to represent your interaction, say T0A1, T1A0 and T1A1.
>
> Another method, the one used by -xi-, is to compute cross products of the
> two variables. The procedure gives you an equation in which you are using
> three variables, two of them labelled as the original variables and the
> third one as their product. So you are using variables labelled T, A and
> TA. Given the original coding scheme and the use of a cross-product, T is
> actually equivalent to T1A0, A to T0A1 and TA to T1A1-(T0A1+T1A0).
>
> Both methods will produce equivalent results.
>
> Now let's say that you are adding a second variable to your equation, B,
> whose effect is also assumed to vary across the values of T. Using the
> first method, you would build an equation containing 6 terms, that is
> three out of T0A0, T0A1, T1A0 and T1A1, and three out of T0B0, T0B1, T1B0
> and T1B1.
>
> If you were to use the second method, you would be using 5 terms: T, A,
> TA, B, TB. The fit of the two models is not the same and I never found a
> reference that dealt with how two such different models could be
> equivalent.
>
> Actually, the second one seems to be equivalent to an equation in which,
> using T0A0 and T0B0 as reference categories, T=(T1A0+T1B0)/2, which would
> be an unwanted assumption in most cases.
>
>
> Benoît Laplante, professeur
> Université du Québec
> Institut National de la Recherche Scientifique
> Urbanisation, Culture et Société
> http://www.inrs-ucs.uquebec.ca/default.asp?p=lapl
>
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

```
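As a cross-check on the claim in the reply above that the "all dummy" and "cross product" methods agree, here is a minimal sketch in Python (not Stata, and not part of the original thread). It compares the two designs over the 8 cells of (T, A, B). It assumes the all-dummies model keeps a constant and drops T0A0 and T0B0, as in the quoted message; the helper names `rank`, `cross_row` and `dummy_row` are made up for illustration. If both designs have the same rank, and stacking them side by side does not raise it, they span the same column space and should therefore give the same fitted values.

```python
from fractions import Fraction
from itertools import product


def rank(rows):
    """Rank of a matrix by exact Gauss-Jordan elimination (no rounding)."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        # Find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Clear column c in every other row.
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def cross_row(t, a, b):
    # Cross-product coding: constant, T, A, TxA, B, TxB.
    return [1, t, a, t * a, b, t * b]


def dummy_row(t, a, b):
    # All-dummies coding (assumed): constant, then T0A1, T1A0, T1A1,
    # then T0B1, T1B0, T1B1. T0A0 and T0B0 are the omitted references.
    kept = ((0, 1), (1, 0), (1, 1))
    ta = [int((t, a) == pair) for pair in kept]
    tb = [int((t, b) == pair) for pair in kept]
    return [1] + ta + tb


cells = list(product((0, 1), repeat=3))  # all 8 (T, A, B) combinations
cross = [cross_row(*c) for c in cells]
dummy = [dummy_row(*c) for c in cells]
both = [x + y for x, y in zip(cross, dummy)]  # columns side by side

rank_cross, rank_dummy, rank_both = rank(cross), rank(dummy), rank(both)
print(rank_cross, rank_dummy, rank_both)  # prints: 6 6 6
```

The all-dummies design has 7 columns but only rank 6, because T1A0 + T1A1 and T1B0 + T1B1 both equal T. One of its six terms is therefore redundant, which would account for the 6-versus-5 term count raised in the quoted message.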