RE: st: Fixed Effects inconsistency between Correlation and Coefficient Direction

From   "Nick Cox" <>
To   <>
Subject   RE: st: Fixed Effects inconsistency between Correlation and Coefficient Direction
Date   Mon, 19 Apr 2010 17:31:48 +0100

Steve's very clear example could be supplemented by 

scatter y x, mla(id) mlabpos(0) ms(i)

In other terminology, this is an example of an amalgamation paradox often named for E.H. Simpson (more rarely for G.U. Yule, who was there earlier (even more rarely for Karl Pearson, who was there even earlier ...)). 
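
To make the reversal visible, the plot can be extended with fitted lines. This is only a sketch: it assumes that y, x and id from the example quoted below are already in memory, and the legend labels are my own wording. The `///` line continuations work in a do-file, not when typed interactively.

```stata
* Sketch only: assumes y, x and id from the quoted example are in memory.
* Points are drawn as their id labels; the dashed line is the pooled fit,
* the three solid lines are separate fits within each id.
twoway (scatter y x, mlabel(id) mlabposition(0) msymbol(i))  ///
       (lfit y x, lpattern(dash))                            ///
       (lfit y x if id == 1)                                 ///
       (lfit y x if id == 2)                                 ///
       (lfit y x if id == 3),                                ///
       legend(order(2 "pooled fit" 3 "within-id fits"))
```

The pooled line follows the between-id differences, while each within-id line follows the relationship that the fixed-effects estimator uses.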


Steve Samuels wrote:

Here's a data set that qualitatively reproduces the phenomenon you
describe. Note the relatively large between-id variation compared to
within-id variation.  I don't understand your statement about dropping
data.  Please provide a reference.

**************************CODE BEGINS**************************
input  id x
  1     1
  1     2
  1     3
  2     4
  2     5
  2     6
  3     7
  3     8
  3     9
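end
* Fix the seed so the simulated noise is reproducible
set seed 123456
* y rises by 10 per id (between-id) but falls by 1 per unit of x (within-id)
gen y = 10*id - x + rnormal(0,1)
xtset id
* Pooled correlation: dominated by the between-id variation
corr y x
* Fixed effects: uses only within-id variation, where the true slope is -1
xtreg y x, fe
* Random effects, for comparison
xtreg y x, re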
***************************CODE ENDS***************************
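
As a hedged sketch of why the signs can differ: the slope reported by -xtreg, fe- can be reproduced by subtracting each id's mean from y and x and regressing the demeaned variables on each other. The code assumes y, x and id from the block above are in memory. The names ybar, xbar, y_w and x_w are made up for this illustration and are not part of the original example.

```stata
* Sketch only: assumes y, x and id from the example above are in memory.
* ybar, xbar, y_w and x_w are illustrative names, not from the original post.
bysort id: egen double ybar = mean(y)
bysort id: egen double xbar = mean(x)
gen double y_w = y - ybar
gen double x_w = x - xbar

* Pooled OLS slope: mixes between-id and within-id variation
regress y x

* Within-id slope: should reproduce the coefficient on x from -xtreg y x, fe-
* (standard errors differ because regress does not count the degrees of
* freedom used up by the id means)
regress y_w x_w, noconstant
```

Demeaning removes all between-id differences, so the large between-id variation that drives the pooled correlation plays no part in the fixed-effects coefficient.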

On Sun, Apr 18, 2010 at 1:42 PM, MICHAEL ESPOSITO <> wrote:

> I have a question that I cannot seem to find an answer to. I am attempting
> to use the fixed effects model for research that I am conducting for my
> dissertation. My committee and I discovered that in certain circumstances
> the results do not seem logical. For instance, the correlation matrix
> indicates a positive relationship between two variables and then when we run
> the Fixed Effects Linear Regression model using the same two variables, the
> coefficient indicates a negative relationship.  I suspect that it may be
> related to something I read that stated that the fixed effects model has the
> tendency to drop a significant amount of data in the independent variable
> when the data is perceived as having a high degree of randomness.
> The correlation matrix suggests a positive relationship .2663 and the
> coefficient correlation indicates a negative -1491.  When I run the same
> variables using the linear regression model with the Mixed Effects
> variation, all findings suggest a positive relationship. Does anyone know
> what could be causing this strange occurrence? Any advice or guidance you
> can provide would be most appreciated.
