Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Standardized interaction terms - which p-values hold?

From	Elisabeth Bublitz <[email protected]>
To	[email protected]
Subject	Re: st: Standardized interaction terms - which p-values hold?
Date	Tue, 15 Jan 2013 17:59:43 +0100

Thanks for your answers! I do agree that it does not make sense to standardize the variables twice. In fact, using Example 1 this would be:


*********
reg smpg shead slength ia2 // (A) new suggestion

reg smpg shead slength sia2 // previous suggestion (standardized interaction)


*********

This shows that, once the variables are standardized before forming the interaction, it makes no difference whether the interaction itself is standardized again. Both regressions yield identical p-values.

This leaves one more question open: how should one handle the changing p-values? The option preferred so far, (A), returns different p-values than a regression with unstandardized variables, (B).

As far as I know, standardizing variables should change only the coefficients, not the p-values. To get the same p-values as in the baseline regression (with unstandardized variables), it is necessary to leave the variables unstandardized before creating the interaction. If I then standardize the interaction term, I get identical p-values (C).
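
One way to see why the p-values in (A) differ from those in (B) is to expand the product of the standardized variables. The following is only a sketch: the symbols m_j and s_j (sample mean and standard deviation of x_j), z_j (the standardized x_j), and the gamma coefficients are notation introduced here, not taken from the posts.

```latex
\begin{align*}
y &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon,
  \qquad x_j = m_j + s_j z_j \\
x_1 x_2 &= m_1 m_2 + m_2 s_1 z_1 + m_1 s_2 z_2 + s_1 s_2 z_1 z_2 \\
y &= \underbrace{(\beta_0 + \beta_1 m_1 + \beta_2 m_2 + \beta_3 m_1 m_2)}_{\gamma_0}
   + \underbrace{s_1(\beta_1 + \beta_3 m_2)}_{\gamma_1}\, z_1
   + \underbrace{s_2(\beta_2 + \beta_3 m_1)}_{\gamma_2}\, z_2
   + \underbrace{s_1 s_2 \beta_3}_{\gamma_3}\, z_1 z_2 + \varepsilon
\end{align*}
```

Under this reading, the interaction coefficient in (A) is just a rescaled beta_3, so its p-value matches (B). The coefficient on z_1, however, tests beta_1 + beta_3 m_2 = 0 (the effect of x_1 at the mean of x_2) rather than beta_1 = 0, which is a different hypothesis and can have a different p-value. In (C) every regressor is an affine function of the corresponding regressor in (B), so each slope is only rescaled and its p-value is unchanged. Standardizing the dependent variable rescales all coefficients and standard errors by the same factor and does not affect the p-values.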


For illustration I follow the examples from before again:

**********
egen sia1 = std(ia)
reg mpg head length ia  // (B) baseline

reg smpg shead slength sia1 // (C) same p-values as (B): interaction formed from unstandardized variables, then standardized

**********

Obviously, the interpretation in these examples does not make sense, but they serve only as an illustration. In general, though, I am wondering what might be going on.


-Elisabeth

On 15.01.2013 17:24, Jeffrey Wooldridge wrote:

For what it's worth, I agree with Joerg. I don't see that
standardizing the interaction makes sense; nor does it solve a
substantive problem. Centering the variables before interacting them
often does, but that's because it forces the coefficients on the level
variables to be interpreted as marginal effects at the means of the
covariates. This often does make more sense than the partial effects
at zero. For example, what sense would it make to estimate the effects
of headroom on mpg for a car with length = 0?

In your example, I assume the variables are rates at something like
the county level. But it still would make no sense to evaluate the
partial effect of death -- whether it is standardized or not -- at
medage = 0.
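
The centering argument above can be tried on the auto data. This is a minimal sketch, not code from the thread: the variable names c_head, c_length and c_ia are made up for illustration, and the comparison with margins assumes a Stata version that supports factor-variable notation (Stata 11 or later).

```stata
sysuse auto, clear

* Center both predictors at their sample means
* (c_head, c_length and c_ia are hypothetical names)
summarize headroom, meanonly
generate double c_head = headroom - r(mean)
summarize length, meanonly
generate double c_length = length - r(mean)
generate double c_ia = c_head * c_length

* The coefficient on c_head is the partial effect of headroom
* evaluated at mean length, not at length = 0
regress mpg c_head c_length c_ia

* The same quantity from the uncentered model:
* margins evaluates the derivative at the covariate means
regress mpg c.headroom##c.length
margins, dydx(headroom) atmeans
```

If the sketch is right, the dy/dx reported by margins should match the coefficient on c_head in the centered regression, while the interaction coefficient is the same in both models.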

On Tue, Jan 15, 2013 at 11:13 AM, Joerg Luedicke
<[email protected]> wrote:

In your two examples, you are comparing apples and oranges. If you
center your variables in example 1 such that their mean is zero, you
should get the same results as in example 2. However, I would not
standardize the interaction term itself because it does not seem to be
very meaningful. If the two predictors are standardized, then their
interaction shows the effect of one predictor on the effect of the
other in standard deviation units. If the interaction term itself is
standardized (or if you calculate a standardized coefficient) you
can't interpret it that way.
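
The claim that centering makes the two versions of Example 1 agree can be checked directly. This is a sketch under assumptions: the names c_headroom, c_length and c_ia are hypothetical, and the standardized variables are recreated here so that the block runs on its own.

```stata
sysuse auto, clear

* Center the predictors, then interact (hypothetical names)
foreach v in headroom length {
    summarize `v', meanonly
    generate double c_`v' = `v' - r(mean)
}
generate double c_ia = c_headroom * c_length
regress mpg c_headroom c_length c_ia, beta

* Standardize first, interact, then standardize the product,
* as in the second half of Example 1
egen double shead = std(headroom)
egen double slength = std(length)
egen double smpg = std(mpg)
generate double ia2 = shead * slength
egen double sia2 = std(ia2)
regress smpg shead slength sia2
```

The Beta column of the first regression should equal the coefficients of the second, with identical p-values, because the product of the centered variables differs from ia2 only by a constant scale factor.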

Joerg

On Tue, Jan 15, 2013 at 10:01 AM, Elisabeth Bublitz
<[email protected]> wrote:

Hi Statalist,

when I compare the p-values of a baseline regression with those obtained
from a regression with standardized coefficients and interaction terms the
following problem comes up: The suggestions previously posted (see,
http://www.stata.com/statalist/archive/2009-04/msg00888.html) are that the
variables forming the interaction need to be standardized before they are
interacted, and a second time afterwards. This changes the p-values and
sometimes even coefficients change their signs. Intuitively this suggests to
me that something with the previous suggestion is not correct.


Here is the example from the previous thread:
*-------------------Example 1--------------------------------
* This version standardizes the IA once and serves as an example of what is "incorrect"
sysuse auto, clear
gen ia = head*length
reg mpg head length ia, beta

* This version standardizes the IA twice and is suggested to be "correct"
egen shead = std(headroom)
egen slength = std(length)
egen smpg = std(mpg)
gen ia2 = shead*slength
egen sia2 = std(ia2)
reg smpg shead slength sia2
*-----------------------------------------------------------------


In this example the changes are visible but do not yet cross important
levels, therefore significance levels stay the same. This is, however,
different for the data I use. I'd be curious to learn what you think about
this.

I found an example where the changes are more visible.

*-------------------Example 2--------------------------------
sysuse census, clear

* Standardizing variables
egen zdivorce = std(divorce)
egen zmarriage = std(marriage)
egen zdeath = std(death)
egen zmedage = std(medage)

* Interaction terms
gen ia= death*medage
egen zia_1= std(ia)
gen test = zdeath*zmedage
egen zia_2 = std(test)

* Regression
reg divorce marriage death medage ia, beta // (1) this follows the simpler procedure
reg divorce marriage death medage test, beta // (2) this standardizes the IA twice; note changes in significance levels and coefficient size
reg zdivorce zmarriage zdeath zmedage zia_2 // (3) for comparison (identical with (2)): this is the same as suggested in the previous thread
*-----------------------------------------------------------------

Unfortunately, I need to compare the size of two interactions and, thus,
need standardized coefficients. If you have other suggestions, let me know.
I was wondering whether it would make sense to use logarithms instead.

Many thanks!
Elisabeth


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


Follow-Ups:
- Re: st: Standardized interaction terms - which p-values hold?
  - From: Maarten Buis <[email protected]>
- Re: st: Standardized interaction terms - which p-values hold?
  - From: Nick Cox <[email protected]>

References:
- st: Standardized interaction terms - which p-values hold?
  - From: Elisabeth Bublitz <[email protected]>
- Re: st: Standardized interaction terms - which p-values hold?
  - From: Joerg Luedicke <[email protected]>
- Re: st: Standardized interaction terms - which p-values hold?
  - From: Jeffrey Wooldridge <[email protected]>
