Re: st: Modelling of categorical-continuous variable interaction

From	Maarten Buis <[email protected]>
To	[email protected]
Subject	Re: st: Modelling of categorical-continuous variable interaction
Date	Mon, 1 Jul 2013 17:49:47 +0200

On Mon, Jul 1, 2013 at 4:35 PM, Daniel Yue wrote:
> a) Is there a quick way to explain why "it is not recommended" to leave out the main effects?

Consider the case where you have one binary variable (D) and one
continuous variable (X).

*** When you specify -reg y i.D#c.X- you estimate the following model:

y = b0 + b1*(D==0)*X + b2*(D==1)*X

So when D==0 the regression line is

y = b0 + b1*X

when D==1 the regression line is:

y = b0 + b2*X

*** When you specify -reg y i.D i.D#c.X- you estimate the following model:

y = b0 + b1*(D==0)*X + b2*(D==1)*X + b3*D

So when D==0 the regression line is

y = b0 + b1*X

when D==1 the regression line is:

y = (b0 + b3) + b2*X

*** When you specify -reg y i.D##c.X- or -reg y i.D X i.D#c.X- you
estimate the following model:

y = b0 + b1*X + b2*D*X + b3*D

So when D==0 the regression line is

y = b0 + b1*X

when D==1 the regression line is:

y = (b0 + b3) + (b1 + b2)*X

You can see that the last two models are just different ways of saying
the same thing, but the first one is a different model.
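
A minimal sketch of how one could check this, assuming the auto
dataset that ships with Stata as stand-in data (price plays the role
of y, foreign of D, and weight of X; none of these come from the
original question):

```stata
* Assumption: auto.dta stands in for the real data
* (price = y, foreign = D, weight = X)
sysuse auto, clear

* Model 1: separate slopes, one common intercept
regress price i.foreign#c.weight
scalar df1 = e(df_m)

* Model 2: separate intercepts, one slope reported per group
regress price i.foreign i.foreign#c.weight
scalar df2 = e(df_m)
scalar slope_for_m2 = _b[1.foreign#c.weight]
predict double yhat2, xb

* Model 3: main effects plus interaction
regress price i.foreign##c.weight
scalar df3 = e(df_m)
scalar slope_for_m3 = _b[weight] + _b[1.foreign#c.weight]
predict double yhat3, xb

* Compare model degrees of freedom and the slope for foreign==1
display "model df: " df1 " " df2 " " df3
display "slope when foreign==1, model 2: " slope_for_m2
display "slope when foreign==1, model 3: " slope_for_m3

* Largest gap between the fitted values of models 2 and 3
generate double gap = abs(yhat2 - yhat3)
summarize gap
```

If models 2 and 3 are the same model, the two slopes should agree and
gap should be zero up to rounding, while model 1 uses one parameter
fewer.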

The key difference is that in the first model you are forcing the
intercepts of the two lines to be the same. The intercept is the
predicted y at X==0, so its meaning depends on where you center X. In
most situations that I am familiar with, where you center X is
arbitrary: you could center at the mean, the median, the minimum, or
some substantively meaningful value within or near the range of the
data. Forcing the two lines to meet at such an arbitrary point is
rarely justified, and this is the logic behind the statement that you
almost never want to exclude the main effects.
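
The effect of centering can be sketched as follows, again assuming the
auto dataset as stand-in data (weight_c and the scalar names are
made up for this illustration):

```stata
* Assumption: auto.dta stands in for the real data
* (price = y, foreign = D, weight = X)
sysuse auto, clear

* Center X at its mean; any other centering point would do as well
summarize weight, meanonly
generate double weight_c = weight - r(mean)

* Common-intercept model: the lines are forced to meet at X==0,
* which is a different point before and after centering
regress price i.foreign#c.weight
scalar rmse1_raw = e(rmse)
regress price i.foreign#c.weight_c
scalar rmse1_ctr = e(rmse)

* Model with main effects: centering only relabels the intercepts
regress price i.foreign##c.weight
scalar rmse3_raw = e(rmse)
regress price i.foreign##c.weight_c
scalar rmse3_ctr = e(rmse)

display "no main effects, raw X:      " rmse1_raw
display "no main effects, centered X: " rmse1_ctr
display "main effects, raw X:         " rmse3_raw
display "main effects, centered X:    " rmse3_ctr
```

The root MSE of the model without main effects should change with the
centering point, while the model with main effects should fit equally
well either way.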

> b) regarding question 3: should that model
>
> by I: xtreg y x1 x2 x3, fe
>
> give the same/equivalent results, if specified correctly?

In simple linear regression you would expect the same point estimates
but different standard errors, because the separate models allow the
residual variance to differ across groups. I suspect that this result
generalizes to fixed effects regression, but this is only my
intuition. There are others on this list who are more knowledgeable.
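
For the simple linear regression case only, a sketch of that
comparison, assuming the auto dataset as stand-in data (foreign plays
the role of the grouping variable; the scalar names are made up for
this illustration, and nothing here tests the -xtreg, fe- case):

```stata
* Assumption: auto.dta stands in for the real data; plain -regress-,
* not -xtreg, fe-
sysuse auto, clear

* Separate regression for the group foreign==0
regress price weight if foreign == 0
scalar b_sep = _b[weight]
scalar se_sep = _se[weight]

* Fully interacted model on the pooled data; with foreign==0 as the
* base level, the coefficient on weight is the slope for that group
regress price i.foreign##c.weight
scalar b_pool = _b[weight]
scalar se_pool = _se[weight]

display "slope, separate model:   " b_sep
display "slope, interacted model: " b_pool
display "s.e., separate model:    " se_sep
display "s.e., interacted model:  " se_pool
```

The slopes should agree, while the standard errors should differ
because the interacted model uses one pooled residual variance.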

-- Maarten

---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany

http://www.maartenbuis.nl
---------------------------------