
# RE: st: Fixed Effects inconsistency between Correlation and Coefficient Direction

From: "Nick Cox"
Subject: RE: st: Fixed Effects inconsistency between Correlation and Coefficient Direction
Date: Mon, 19 Apr 2010 17:31:48 +0100

```
Steve's very clear example could be supplemented by

scatter y x, mla(id) mlabpos(0) ms(i)

In other terminology, this is an example of an amalgamation paradox often named for E.H. Simpson (more rarely for G.U. Yule, who was there earlier (even more rarely for Karl Pearson, who was there even earlier ...)).

Nick
n.j.cox@durham.ac.uk

Steve Samuels

Here's a data set that qualitatively reproduces the phenomenon you
describe. Note the relatively large between-id variation compared to
within-id variation.  I don't understand your statement about dropping
data.  Please provide a reference.

Steve
**************************CODE BEGINS**************************
clear
input  id x
1     1
1     2
1     3
2     4
2     5
2     6
3     7
3     8
3     9
end
set seed 123456
gen y = 10*id -x + rnormal(0,1)
xtset id
list
corr y x
xtreg y x, fe
xtreg y x, re
***************************CODE ENDS***************************

On Sun, Apr 18, 2010 at 1:42 PM, MICHAEL ESPOSITO <mespo12@optonline.net> wrote:

> I have a question that I cannot seem to find an answer to. I am attempting
> to use the fixed effects model for research that I am conducting for my
> dissertation. My committee and I discovered that in certain circumstances
> the results do not seem logical. For instance, the correlation matrix
> indicates a positive relationship between two variables and then when we run
> the Fixed Effects Linear Regression model using the same two variables, the
> coefficient indicates a negative relationship.  I suspect that it may be
> related to something I read that stated that the fixed effects model has the
> tendency to drop a significant amount of data in the independent variable
> when the data is perceived as having a high degree of randomness.
>
> The correlation matrix suggests a positive relationship .2663 and the
> coefficient correlation indicates a negative -1491.  When I run the same
> variables using the linear regression model with the Mixed Effects
> variation, all findings suggest a positive relationship. Does anyone know
> what could be causing this strange occurrence? Any advice or guidance you
> can provide would be most appreciated.

```
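A sketch, not part of the original thread, of why the pooled correlation and the fixed-effects coefficient can disagree in sign in Steve's example. It rebuilds the same toy data (generated rather than typed in, and stored as doubles), then separates the between-id relationship from the within-id one. The names `ybar`, `xbar`, `yw`, `xw` and `b_fe` are illustrative assumptions, and the plot simply extends Nick's `scatter` suggestion with fitted lines.

```stata
* Rebuild the toy data: 3 ids, 3 observations each, x = 1,...,9
clear
set obs 9
gen double x = _n
gen id = ceil(_n/3)
set seed 123456
gen double y = 10*id - x + rnormal(0,1)
xtset id

* Pooled correlation and between estimator: both driven by differences across ids
corr y x
xtreg y x, be

* Within (fixed-effects) estimator: uses only variation inside each id
xtreg y x, fe
scalar b_fe = _b[x]

* The same slope by hand: subtract each id's mean, then run OLS
egen double ybar = mean(y), by(id)
egen double xbar = mean(x), by(id)
gen double yw = y - ybar
gen double xw = x - xbar
regress yw xw, noconstant
display "FE slope = " b_fe "   demeaned OLS slope = " _b[xw]

* Pooled fit versus one fitted line per id
twoway (scatter y x, mlabel(id) mlabpos(0) msymbol(i)) ///
       (lfit y x) ///
       (lfit y x if id==1) (lfit y x if id==2) (lfit y x if id==3), ///
       legend(off)
```

The slope from the demeaned regression should match the `xtreg, fe` slope; its standard error will not, because `regress` does not adjust the degrees of freedom for the id means that were removed. Nothing is dropped from the data here: the within estimator just ignores the between-id variation that dominates the pooled correlation.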