
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


RE: st: RE: questions about Fixed Effect models

From   "Wooldridge, Jeffrey" <>
To   <>
Subject   RE: st: RE: questions about Fixed Effect models
Date   Fri, 25 Feb 2011 17:05:28 -0500

Hi Austin:

The situation here is a bit different: unlike in the case of using the sample variance for, say, student fixed effects based on five time periods -- the usual situation -- the teacher effects are hopefully estimated using more like 100 students. This makes them much more precise. It's true that in the usual case the naïve sample variance is systematically biased. I agree that there is an adjustment that can be used in the teacher effect case, but it will be less important.
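The adjustment in question (observed variance of the estimated effects minus the mean squared standard error) can be sketched on simulated data. This is only an illustration under assumed values: the 200 teachers, the 100 students per teacher, the variance parameters, and all variable names are hypothetical, and plain class means stand in for the estimated teacher effects.

```stata
* Sketch only: simulated data, hypothetical names and parameter values.
clear
set seed 20110225
* 200 hypothetical teachers, true effect variance 0.2^2 = 0.04
set obs 200
gen teacherid = _n
gen theta = rnormal(0, 0.2)
* 100 students per teacher, student-level noise variance 1
expand 100
gen score = theta + rnormal(0, 1)

* Estimated effect = class mean; its squared standard error = s^2 / n
collapse (mean) theta_hat=score (sd) s=score (count) n=score, by(teacherid)
gen se2 = s^2 / n

* Naive variance of the estimates, mean squared SE, and their difference
quietly summarize theta_hat
scalar var_naive = r(Var)
quietly summarize se2
scalar var_noise = r(mean)
scalar var_adj = var_naive - var_noise
display "naive: " var_naive "   noise: " var_noise "   adjusted: " var_adj
```

With 100 students per teacher the noise term is about 1/100 = 0.01 here, so the correction is modest relative to the assumed true variance of 0.04. With only five observations per unit the same noise term would be about 1/5 = 0.2, which is why the adjustment matters so much more in the usual short-panel case.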

I've been doing a lot of simulations with co-authors on VAM estimation, and, although the setting is necessarily simplified, by far the most robust method for estimating the teacher effects is to use dynamic regression. Even when the theoretically best method is random effects on the first difference, the dynamic OLS estimator does almost as well. In other words, it is theoretically inconsistent but does a good job with the teacher effects. What Jesse's work does not recognize is several important points:

1. Measurement error in the lagged y does not necessarily mean the teacher effects are badly estimated. My simulations with Cassie Guarino and Mark Reckase suggest otherwise.

2. In terms of the assignment of teachers to students, principals may very well be using the lagged observed score, which makes it the right thing to control for.

3. Jesse's argument is more from the perspective of the structural production function literature. My simulation work indicates this is much too limited. Even methods such as Arellano and Bond work much less well in certain simulations for estimating the teacher effects than just dynamic OLS.
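Point 1 can be illustrated in a deliberately simple setting. The sketch below is not the simulation design referred to above; it assumes random assignment of students to teachers, a true coefficient of one on lagged achievement, and an observed lagged score with reliability 0.6. All names and parameter values are hypothetical.

```stata
* Sketch only: simulated data, hypothetical names and parameter values.
clear
set seed 2011
* 5000 hypothetical students randomly assigned to 50 teachers
set obs 5000
gen teacherid = ceil(50 * runiform())
* True lagged achievement, and an observed lag with reliability 1/(1 + 2/3) = 0.6
gen ystar_1 = rnormal(0, 1)
gen score_1 = ystar_1 + rnormal(0, sqrt(2/3))
* One true effect per teacher, copied to every student of that teacher
bysort teacherid: gen theta = rnormal(0, 0.2) if _n == 1
bysort teacherid: replace theta = theta[1]
gen score = ystar_1 + theta + rnormal(0, 0.5)

* Dynamic OLS on the mismeasured lag with a full set of teacher dummies
regress score score_1 i.teacherid
scalar b_lag = _b[score_1]

* Recover each teacher's dummy coefficient (relative to the base teacher)
predict double xb, xb
gen theta_hat = xb - _b[score_1] * score_1 - _b[_cons]
collapse (mean) theta_hat theta, by(teacherid)
correlate theta_hat theta
scalar rho = r(rho)
display "coef on lagged score: " b_lag "   corr(theta_hat, theta): " rho
```

Under these assumptions the coefficient on the lagged score is attenuated toward 0.6, yet the estimated teacher effects still track the true ones closely, because random assignment leaves the teacher dummies uncorrelated with the measurement error. Nonrandom assignment on the true or the observed lagged score would change the picture, which is the substance of points 2 and 3.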

Our paper got rejected from the QJE and we are currently revising it. I'd be happy to send you an older version.

Cheers, Jeff

-----Original Message-----
From: [] On Behalf Of Austin Nichols
Sent: Friday, February 25, 2011 4:51 PM
Subject: Re: st: RE: questions about Fixed Effect models

One problem when computing the variance of the teacher effects is that
these are noisily estimated; see e.g. the discussion in -fese- on SSC:
[FE] standard errors are not usually computed in a fixed-effects
regression, but we may need them. One example takes student test
scores as the dependent variable and teacher assignments as the
explanatory variables, as in Rothstein (2007), where the fixed effects
measure the assumed additive effect of a teacher on her students' test
scores. The variance of estimated fixed effects captures both the
variance of true fixed effects and the variance of the estimator: the
variance of true fixed effects (i.e. how disparate are teachers'
apparent impacts on students' scores) can be estimated as the observed
variance in estimated fixed effects less the best estimate for the
variance of the estimator, which is the mean of squared standard
Putting in lagged achievement (test score) is not in general a good
idea, since this is measured with error--if you instrument for lagged
achievement you will get a coef near one, whereas if you treat it as
measured without error, you get a coef nearer 0.6 which is presumably
biased downward by classical measurement error.  The equation y_t = .6
y_{t-1} + Xb may be subtracting off the wrong quantity, effectively
regressing y_t - .6 y_{t-1} on X, which can introduce bias in
estimates of b (add'l .4 times lagged ach in the error may be
correlated with X).  On the other hand, some would argue that "decay"
means that the true coef on lagged achievement should not really be

On Fri, Feb 25, 2011 at 4:27 PM, Wooldridge, Jeffrey <> wrote:
> Because I've been doing some work estimating teacher value added, I'll take a crack at this. First is an issue of terminology. While it is common to say things like "teacher fixed effects" when using student-level data, I'm not sure using the fixed effects commands in Stata (xtreg) is the right way to go. In fact, mechanically I'm not sure how you're doing it. Aren't the teacher effects the main quantities of interest? If so, you should just be using pooled OLS, putting in the lagged proficiency, and then including a full set of teacher dummy variables (with, presumably, a base teacher represented by the constant).
> None of the statistics that you mention would be relevant except perhaps the error variances. An interesting calculation is to compute the usual variance of the OLS residuals and then also compute the variance of the teacher effects. (You might have to export them or put them into a Stata matrix to do this.) This would tell you how important the teacher effect is relative to the overall variance in student proficiency.
> My Stata session would look something like this:
> xtset studentid year
> gen score_1 = l.score
> reg score score_1 i.teacherid i.year, cluster(studentid)
> (or just include a full set of teacher dummies if you have created them).
> It is also common to use student fixed or random effects on the differenced score:
> gen dscore = d.score
> xtreg dscore i.teacherid i.year, fe cluster(studentid)
> Jeff W.
> -----Original Message-----
> From: [] On Behalf Of Stata Email
> Sent: Friday, February 25, 2011 3:05 PM
> To:
> Subject: st: questions about Fixed Effect models
> Dear Statalist members
> I am new in panel data and I am working with fixed effect models. I
> would like to confirm if I am doing the right thing
> When working with panel data, the data set is such that we have
> information about individuals i and we observe these individuals
> through different time periods t. My questions are
> 1) Which part of the Stata output shows me that the fixed effect is important?
> 2) What does it mean exactly R-sq within? R-sq between?
> 3) If I run a fixed effect model, the sigma-u is the std dev of the
> residuals inside (within) each group of individuals i. So a higher
> number means that I have more variability inside each group?
> 4) sigma-e shows the std dev of the residuals after excluding the
> variability inside each group i? If that is true, a higher number
> means that I have a big variability among groups i and therefore the
> fixed effect is important?
> Now let me explain what kind of data set I have. I have a data set
> with the proficiency level of students, followed for 5 years. But I
> know who is the teacher for every student in all 5 years. I want to
> calculate a teacher fixed effect (and I control for the proficiency
> level from the previous year instead of having a student fixed
> effect). My other questions are
> 5) My individuals i here are the teachers and, instead of having a
> time t, I have students s with the same teacher
> 6) All within statistics will refer to the differences among students
> with the same teacher?
> I really appreciate any comment
> Isabel

*   For searches and help try:

