


Re: st: Difference in Difference vs. Fixed Effects

From   Joerg Lang <>
Subject   Re: st: Difference in Difference vs. Fixed Effects
Date   Wed, 2 Oct 2013 18:43:00 +0200

Dear David,

thanks for your response and the reference.

The assignment was not randomized. Rather, the households could
self-select into the treatment, which in my case is solar panels.
Note that my data are at the household level.
I don't think that the bounded structure of the outcome variable, which
is an index calculated from 7 variables, is a severe problem. I
applied a logit and a probit estimation and the results are similar.
Also, I predicted the fitted values from the -reg- command given above,
and they are all bounded between 0 and 1.

I have already looked into some books, especially on fixed effects.
What should be noted, though, is that the first model above, as
specified, is rather a pooled model, while the other one is a
fixed-effects model. Thus, from a theoretical point of view, the
difference could be attributed to unobserved individual heterogeneity.
At least, this is what I thought. However, I have not found any book
that makes this clear. Rather, all of the ones I looked into simply
state that the results should be the same if only two periods are
observed. If anyone here could help me, that would be very much
appreciated. Is there anything wrong with my Stata command? Or, if
not, what do I then
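As a sanity check on the textbook claim, here is a minimal simulation sketch (in Python/numpy rather than Stata; all variable names, sample sizes, and parameter values are made up for illustration) showing that with a balanced panel and exactly two periods, the pooled DiD interaction coefficient and the within (fixed-effects) estimate coincide, even when the household effects are correlated with treatment uptake:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200                                          # households (hypothetical)
treat = (rng.random(n) < 0.5).astype(float)      # uptake indicator (randomized here only for the sketch)
alpha = rng.normal(0, 1, n) + 0.5 * treat        # household effects, correlated with uptake
y0 = alpha + rng.normal(0, 0.2, n)               # baseline outcome (2010)
y1 = alpha + 0.4 + 1.0 * treat + rng.normal(0, 0.2, n)  # follow-up (2012): time effect 0.4, treatment effect 1.0

# Pooled DiD regression: y = b0 + b1*Time + b2*Treatment + b3*Time*Treatment + u
y = np.concatenate([y0, y1])
time = np.concatenate([np.zeros(n), np.ones(n)])
tr = np.concatenate([treat, treat])
X = np.column_stack([np.ones(2 * n), time, tr, time * tr])
b_pooled = np.linalg.lstsq(X, y, rcond=None)[0]

# Fixed effects via the within transformation: demean y and regressors by household.
# Treatment itself is time-invariant, so it is swept out, as in -xtreg, fe-.
ybar = (y0 + y1) / 2
y_w = np.concatenate([y0 - ybar, y1 - ybar])
time_w = np.concatenate([np.zeros(n) - 0.5, np.ones(n) - 0.5])
inter_w = np.concatenate([-0.5 * treat, 0.5 * treat])
Xw = np.column_stack([time_w, inter_w])
b_fe = np.linalg.lstsq(Xw, y_w, rcond=None)[0]

print(b_pooled[3], b_fe[1])   # the two interaction coefficients agree
```

With only two periods and no missing observations the two point estimates are algebraically identical, so any discrepancy in practice should come from something else (an unbalanced panel, dropped observations, or differing covariates), not from the estimators themselves.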

2013/10/2 David Hoaglin <>:
> Dear Joerg,
> If your outcome variable has values only between 0 and 1, ordinary
> regression may not be appropriate.  Please tell us more about the
> nature of your outcome variable.
> Are your treatment and control groups observational (i.e., not the
> result of randomized assignment)?  Please tell us more about that
> aspect of your data.
> The fixed-effects model does not remove only the effect of treatment.
> It removes the effects of all potential confounders, both observed and
> unobserved, that do not vary with time.  The book by Fitzmaurice,
> Laird, and Ware (2011) has an accessible discussion of fixed-effects
> and random-effects models, and it should be possible to find other
> relevant articles and books.
> David Hoaglin
> Fitzmaurice GM, Laird NM, Ware JH (2011). Applied Longitudinal
> Analysis, second edition.  Wiley.
> On Wed, Oct 2, 2013 at 2:26 AM, Joerg Lang <> wrote:
>> Dear Statalist users,
>> Currently writing my Master's thesis and working with
>> Stata 12, I have the following problem.
>> I have a dataset with two time periods (2010 and 2012) and two groups
>> (treatment and control). There is no treatment in the baseline, and the
>> treatment group takes up the treatment between 2010 and 2012. The uptake is
>> non-random.
>> Now, I want to estimate the impact in a difference-in-differences design.
>> At first, I estimate the following model:
>>  y = b0 + b1*Time + b2*Treatment + b3*Time*Treatment + u
>> using the -reg- command:
>> reg y i.time##i.treatment, vce(cluster h1)
>> where y is the outcome variable that is between 0 and 1, and h1 is the
>> household identifier. I use the cluster option to account for the problem
>> of serial correlation. In a second estimation I also include some other
>> covariates.
>> I always thought that this setting and a setting with fixed effects
>> yield exactly the same result as long as one has only two points in time
>> (in my case 2010 and 2012).
>> However, estimating the same model with:
>> xtset h1 time
>> xtreg y i.time##i.treatment, fe vce(cluster h1)
>> gives slightly different results. The difference increases when I include
>> more covariates, which are the same in both cases. Also, the sample does
>> not change: the same households and the same variables are used in both
>> estimations.
>> Obviously, treatment is omitted in the xtreg case since it does not vary
>> over time.
>> However, I think that this should not change anything.
>> My question is:
>>  Is my model correctly specified or did I overlook something?
>> And, if my estimation is correct: why is there a difference? Is this "normal"? If
>> so, what does it tell me, i.e. what is the reason for it?
>> Since I have already been stuck on this problem for quite a while, any
>> help or literature suggestions would be very much appreciated. I hope
>> this question is not too trivial for you.
>> Best regards,
>> Joerg
> *
> *   For searches and help try:
> *
> *
> *
*   For searches and help try:
