FAQ: Interpreting the intercept in the fixed-effects model


How can there be an intercept in the fixed-effects model estimated by xtreg, fe?

Title		Interpreting the intercept in the fixed-effects model
Author		William Gould, StataCorp

The results that xtreg, fe reports have simply been reformulated so that the reported intercept is the average value of the fixed effects.

Intuition

One way of writing the fixed-effects model is

$y_{it} = a + x_{it}b + v_i + e_{it} \qquad\qquad\qquad\qquad\qquad\qquad \text{(1)}$

where $v_i$ $(i=1, \ldots, n)$ are simply the fixed effects to be estimated. With no further constraints, the parameters $a$ and $v_i$ do not have a unique solution. You can see that by rearranging the terms in (1):

$y_{it} = (a + v_i) + x_{it}b + e_{it}$

Consider some solution that has, say, $a=3$. Then we could just as well say that $a=4$ and subtract the value $1$ from each of the estimated $v_i$.
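The same argument works for any shift, not just a shift of $1$. As a short restatement (the constant $c$ is introduced here only for illustration):

$$y_{it} = (a + c) + x_{it}b + (v_i - c) + e_{it} \qquad \text{for any constant } c$$

Every choice of $c$ leaves the sum $a + v_i$, and therefore the fitted values, unchanged, so the data alone cannot pin down $a$ and the $v_i$ separately.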

Thus, before (1) can be estimated, we must place another constraint on the system. Any constraint will do, and the choice we make will have no effect on the estimated $b$. One popular constraint is $a=0$, but we could just as well constrain $a=3$. Changing the value of $a$ would merely change the corresponding values of $v_i$. Nor do we have to constrain $a$; we could place a constraint on the $v_i$ instead. We could, for instance, constrain $v_1 = 0$ or $v_5 = 3$.

The constraint that xtreg, fe places on the system is computationally more difficult:

$$\sum_{i=1}^{n} \sum_{t=1}^{T_i} v_i = 0 \qquad\qquad\qquad\qquad\qquad\qquad\qquad\quad \text{(c1)}$$

This constraint means that the panel fixed effects sum to $0$ across all observations in the sample. If the panels are unbalanced, the $v_i$ are effectively weighted by the number of observations in the panel.
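To see where the weighting comes from, note that $v_i$ does not vary with $t$, so the inner sum in (c1) just counts the $T_i$ observations in panel $i$. The constraint can therefore be written as

$$\sum_{i} T_i \, v_i = 0$$

Dividing by the total number of observations shows that the observation-weighted average of the $v_i$ is $0$. When every panel has the same number of observations, this reduces to the unweighted condition $\sum_{i} v_i = 0$.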

Because the constraint we choose is arbitrary, we chose a constraint that makes interpreting the results more convenient. The random-effects estimator proceeds under the *ASSUMPTION* that $E(v)=0$ and hence can estimate an intercept. We parameterize the fixed-effects estimator so that it proceeds under the *CONSTRAINT* (c1). This constraint has no implication since we had to choose some constraint anyway.

The primary advantage of this constraint is that if you fit some model and then obtain the predictions

. xtreg y x1 x2 x3, fe
. predict yhat

then the average value of $\hat{y}$ will equal the average value of $y$. To obtain estimates with the fixed-effects estimator, we had to impose an arbitrary constraint. Had we instead constrained $a=0$, predict yhat would have produced $\hat{y}$ equal to $x_{it}b$ alone, whose average generally differs from the average value of $y$. That would be the only difference; the predictions would differ by a constant (namely, by the difference in their respective values of $a$).
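Both properties can be checked directly. The following is a minimal sketch, assuming the small dataset listed in the Demonstration section (entered here with input so that the block runs on its own); the variable names yhat and u_hat are arbitrary choices, not anything xtreg requires.

```stata
* Enter the 11-observation dataset listed in the Demonstration section
clear
input group x y
1  0 -5
1  8 23
1 17 44
2 10 29
2 16 26
3  4 17
3 11 17
3  5 31
4 18 50
4  5 26
4  2 17
end

xtset group
xtreg y x, fe

* Linear prediction a + x*b (the default statistic of predict)
predict double yhat
* Estimated fixed effects v_i (the u statistic of predict after xtreg, fe)
predict double u_hat, u

* The means of y and yhat should agree, and the mean of u_hat should be
* zero up to rounding, as constraint (c1) requires
summarize y yhat u_hat
```

In this dataset the $y$ values sum to $275$ over $11$ observations, so both means should come out as $25$.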

Using the constraint (c1) has another advantage. Let us draw a distinction between models and estimators. The *MODEL* is

$$y_{it} = a + x_{it}b + v_i + e_{it} \qquad\qquad\qquad\qquad\qquad\qquad\quad \text{(1)}$$

Under the random-effects *MODEL*, it is assumed that $E(v)=0$ and that $v_i$ and $x_{it}$ are uncorrelated. From that model, we can derive the random-effects *ESTIMATOR*.

Under the fixed-effects *MODEL*, no assumptions are made about $v_i$ except that they are fixed parameters. From that model, we can derive the fixed-effects *ESTIMATOR*.

It turns out that the fixed-effects *ESTIMATOR* is an admissible estimator for the random-effects *MODEL*; it is merely less efficient than the random-effects *ESTIMATOR*. That is,


                                          Model
        Estimator           fixed effects             random effects

         fixed effects       appropriate                appropriate
        random effects      inappropriate               appropriate

When you use the fixed-effects *ESTIMATOR* for the random-effects *MODEL*, the intercept $a$ reported by xtreg, fe is the appropriate estimate for the intercept of the random-effects model.

Derivation

The fixed-effects model is

$$y_{it} = a + x_{it}b + v_i + e_{it} \qquad\qquad\qquad\qquad\qquad\qquad\quad\: \text{(1)}$$

From which it follows that

$$\overline{y}_i = a + \overline{x}_i b + v_i + \overline{e}_{i} \qquad\qquad\qquad\qquad\qquad\qquad\qquad \text{(2)}$$

where

$$\overline{y}_i \qquad \overline{x}_i \qquad \overline{e}_i$$

are the averages of

$$y_{it} \qquad x_{it} \qquad e_{it}$$

within $i$.

Subtracting (2) from (1), we obtain

$$y_{it} - \overline{y}_i = (x_{it} - \overline{x}_i)b + (e_{it} - \overline{e}_i) \qquad\qquad\qquad\qquad\quad\: \text{(3)}$$

Equation (3) is the way many people think about the fixed-effects estimator. $a$ remains unestimated in this formula. From (1), it also follows that

$$\overset{=}{y} = a + \overset{=}{x}b + \overline{v} + \overset{=}{e} \qquad\qquad\qquad\qquad\qquad\qquad\qquad\quad \text{(4)}$$

where

$$\overset{=}{y} \qquad \overset{=}{x} \qquad \overline{v} \qquad \overset{=}{e}$$

are the grand averages of

$$y_{it} \qquad x_{it} \qquad v_i \qquad e_{it}$$

For instance,

$$\overset{=}{y} = \frac{\sum_{i=1}^{n} \sum_{t=1}^{T_i} y_{it}}{\text{total number of observations}}$$

Summing (3) and (4), we obtain

$$y_{it} - \overline{y}_i + \overset{=}{y} = a + (x_{it} - \overline{x}_i + \overset{=}{x})b + \overline{v} + (e_{it} - \overline{e}_i + \overset{=}{e}) \qquad\qquad \text{(5)}$$

xtreg, fe estimates the above equation under the constraint

$$\overline{v} = 0$$

which is to say, it estimates

$$y_{it} - \overline{y}_i + \overset{=}{y} = a + (x_{it} - \overline{x}_i + \overset{=}{x})b + \text{noise}$$

Thus the left-side variable is $y_{it}$ minus its within-group mean but with the grand mean added back in, and the right-side variables are $x_{it}$ minus their within-group means but with the grand means added back in. Obviously, adding in grand means to the left and right sides has no effect on the estimated $b$.
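The transformation can be carried out by hand. The following is a minimal sketch, assuming the small dataset listed in the Demonstration section; the variable names (ybar_i, ybar_all, yt, and so on) are made up for this illustration.

```stata
* Enter the 11-observation dataset listed in the Demonstration section
clear
input group x y
1  0 -5
1  8 23
1 17 44
2 10 29
2 16 26
3  4 17
3 11 17
3  5 31
4 18 50
4  5 26
4  2 17
end

* Within-group means
egen double ybar_i = mean(y), by(group)
egen double xbar_i = mean(x), by(group)

* Grand means over all observations
egen double ybar_all = mean(y)
egen double xbar_all = mean(x)

* Subtract the group mean and add the grand mean back in
generate double yt = y - ybar_i + ybar_all
generate double xt = x - xbar_i + xbar_all

* Ordinary regression WITH a constant on the transformed variables
regress yt xt
```

The coefficient on xt and the constant should match the coefficient on x and _cons reported by xtreg y x, fe. The standard errors from this regression should not be trusted, because regress does not know that the group means were estimated.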

Demonstration

Fixed-effects regression is supposed to produce the same coefficient estimates and standard errors as ordinary regression when indicator (dummy) variables are included for each of the groups. Because the fixed-effects model is

$$y_{it} = x_{it}b + v_i + e_{it}$$

and $v_i$ are fixed parameters to be estimated, this is the same as

$$y_{it} = x_{it}b + v_1d1_i + v_2d2_i + \cdots + e_{it}$$

where $d1$ is $1$ when $i=1$ and $0$ otherwise, $d2$ is $1$ when $i=2$ and $0$ otherwise, and so on. $d1$, $d2$, $...$, are just dummy variables indicating the groups, and $v_1$, $v_2$, $...$, are their regression coefficients, which we must estimate.
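The dummy-variable equation above has no separate intercept, which corresponds to the constraint $a=0$ mentioned in the Intuition section. As a minimal sketch, assuming the small dataset listed below, both that constraint and the $v_1=0$ constraint can be fit with regress by using factor-variable notation:

```stata
* Enter the 11-observation dataset used in this demonstration
clear
input group x y
1  0 -5
1  8 23
1 17 44
2 10 29
2 16 26
3  4 17
3 11 17
3  5 31
4 18 50
4  5 26
4  2 17
end

* Constraint a = 0: one dummy per group (ibn. = no base level) and no
* constant, so each group coefficient estimates a + v_i
regress y x ibn.group, noconstant

* Constraint v_1 = 0: group 1 is the base level, and _cons plays the
* role of a
regress y x i.group
```

Both commands should report the same coefficient on x. Only the way the group-level constants are split between the intercept and the group coefficients changes.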

The problem is that we typically have lots of groups—perhaps thousands—and including lots of dummy variables is too computationally expensive, so we look for a shortcut.

Nevertheless, we could take a little dataset with just a few groups and compare the methods. Here is my little dataset:

. list


     group    x    y 
 1.      1    0   -5 
 2.      1    8   23 
 3.      1   17   44 
 4.      2   10   29 
 5.      2   16   26 
 6.      3    4   17 
 7.      3   11   17 
 8.      3    5   31 
 9.      4   18   50 
10.      4    5   26 
11.      4    2   17

I am going to show you

1. what regress with group dummies reports;
2. that xtreg, fe reports the same results;
3. that removing the within-group means and estimating a regression on the deviations without an intercept (as given in equation 3) produces the same coefficients but different standard errors.

How can method 3 be wrong? Because it fails to account for the fact that the means we removed are *ESTIMATES*. As a consequence, it understates standard errors.

1. What regress with group dummies reports

. regress y x i.group


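      Source |       SS           df       MS      Number of obs   =        11
-------------+----------------------------------   F(4, 6)         =      4.01
       Model |  1554.16667         4  388.541667   Prob > F        =    0.0643
    Residual |  581.833333         6  96.9722222   R-squared       =    0.7276
-------------+----------------------------------   Adj R-squared   =    0.5460
       Total |        2136        10       213.6   Root MSE        =    9.8474

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
           x |          2   .5372223     3.72   0.010     .6854644    3.314536
             |
       group |
          2  |       -2.5   9.332493    -0.27   0.798    -25.33579    20.33579
          3  |   4.333333   8.090107     0.54   0.611    -15.46245    24.12911
          4  |   10.33333   8.040407     1.29   0.246    -9.340834     30.0075
             |
       _cons |          4   7.236455     0.55   0.600    -13.70697    21.70697
------------------------------------------------------------------------------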

2. xtreg, fe reports the same results

. xtset group

Panel variable: group (unbalanced)

. xtreg y x, fe

Fixed-effects (within) regression               Number of obs     =         11
Group variable: group                           Number of groups  =          4

R-squared:                                      Obs per group:
     Within  = 0.6979                                         min =          2
     Between = 0.1716                                         avg =        2.8
     Overall = 0.6146                                         max =          3

                                                F(1,6)            =      13.86
corr(u_i, Xb) = -0.1939                         Prob > F          =     0.0098



 
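------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
           x |          2   .5372223     3.72   0.010     .6854644    3.314536
       _cons |   7.545455   5.549554     1.36   0.223    -6.033816    21.12472
-------------+----------------------------------------------------------------
     sigma_u |  5.6213466
     sigma_e |  9.8474475
         rho |  .24577354   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(3, 6) = 0.83                         Prob > F = 0.5241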

If you compare, you will find that regress with group dummies reported the same coefficient (2) and the same standard error (.5372223) for x as xtreg, fe just did. In both cases, the t statistic is 3.72.
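The intercepts differ (4 versus 7.545455) only because the two commands impose different constraints. regress with i.group sets $v_1=0$, so its group coefficients $0$, $-2.5$, $4.333333$, and $10.33333$ are the $v_i$ under that constraint. Recentering them by their observation-weighted mean (the panels have 3, 2, 3, and 3 observations) reproduces the intercept that xtreg, fe reports:

$$\overline{v} = \frac{3(0) + 2(-2.5) + 3(4.333333) + 3(10.33333)}{11} = \frac{39}{11} \approx 3.545455$$

$$a = 4 + 3.545455 = 7.545455$$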

3. Fitting the deviation model reports incorrect standard errors

. egen double ybar = mean(y), by(group)
 
. egen double xbar = mean(x), by(group)
 
. gen yd = y-ybar
 
. gen xd = x-xbar
 
. regress yd xd, noconstant


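      Source |       SS           df       MS      Number of obs   =        11
-------------+----------------------------------   F(1, 10)        =     23.10
       Model |  1343.99999         1  1343.99999   Prob > F        =    0.0007
    Residual |  581.833327        10  58.1833327   R-squared       =    0.6979
-------------+----------------------------------   Adj R-squared   =    0.6677
       Total |  1925.83332        11  175.075756   Root MSE        =    7.6278

------------------------------------------------------------------------------
          yd | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
          xd |          2   .4161306     4.81   0.001     1.072803    2.927197
------------------------------------------------------------------------------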

So, to summarize:


                                       x
                          Coefficient   Std. err.      t

regress with dummies           2        .5372223     3.72
xtreg, fe                      2        .5372223     3.72
removing the means             2        .4161306     4.81

regress with dummies definitionally calculates correct results.

xtreg, fe matches them.

Removing the means and estimating on the deviations with the noconstant option produces correct coefficients but incorrect standard errors. Why? Because we did not account for the fact that the means we removed from y and x were estimated.
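The size of the error can be traced to the residual degrees of freedom. The regression on deviations has the same residual sum of squares (581.8333) but divides it by $11 - 1 = 10$ degrees of freedom, whereas the correct count also subtracts the four estimated group means, $11 - 1 - 4 = 6$. Rescaling the reported standard error accordingly recovers the correct one:

$$0.4161306 \times \sqrt{\frac{10}{6}} \approx 0.5372$$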
