Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Sjoerd van Bekkum <vanbekkum@ese.eur.nl>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: Difference-in-Differences and Panel Data - In search of an adequate regression
Date: Thu, 3 Jan 2013 18:32:05 +0100

Austin,

I fully agree. Nobody would contest clustering (my first point merely points out an even more severe issue in Barbara's original post). My model contains group fixed effects, not individual ones, but your point (I think) that one cannot include both is valid since a group FE model would be nested within an individual FE model. However, I think neither of our points is exactly what Barbara means, judging from her latest reply (in new thread but copied below):

Barbara,

if you have, say, N*T observations for N individuals, T time periods, and J treatment groups (J <= N), clustering over id indicates N independent observations. However, if you suspect that individuals within the groups are correlated, you should cluster over groups rather than id. This indicates only J independent observations.

---

**** Barbara's latest message: ****

Dear Sjoerd,

thanks a lot for your kind reply. I got the basic assumptions of the DiD and my model contains (of course) the treatment group and time dummies as well as the interaction term. I was too brief on that, sorry!

I am just too confused about the clustering: With panel data, I surely cluster for the individuals, let's say their id:

reg y x d1 d2 d1d2, cluster(id)

this accounts for the correlation between the residuals that arise because the observations within one individuals are likely to be non-independent. but how do i account for common group errors, i.e. errors that might affect the whole treatment group? Am I done clustering for the individuals? Or am I just terribly wrong?

I would greatly appreciate another kind answer.

Best from Berlin,
Barbara

On 3 January 2013 16:13, Austin Nichols <austinnichols@gmail.com> wrote:
>
> Sjoerd van Bekkum <vanbekkum@ese.eur.nl>:
> OP asked about FE, which presumably are collinear with the treatment
> dummy and its interaction. You did not include fixed effects in your
> model. As I noted, the cluster-robust SE *do* make sense, but the FE
> probably not (unless some FE not collinear with the treatment are
> meant).
>
> On Wed, Jan 2, 2013 at 10:09 PM, Sjoerd van Bekkum <vanbekkum@ese.eur.nl> wrote:
> > Maybe my post was too brief. I think what Barbara wants to do (and
> > what I meant in my previous post) is, assuming two groups and 2
> > (pre-/post) periods:
> >
> > y = a + b*D(treatment=1) + c*D(post=1) + d*D(treatment=1)*D(post=1) + e,
> >
> > where D are indicator variables. As mentioned in the paper I cited
> > above, this leads to the following groups:
> >
> > E[y|treated, post] = a+b+c+d
> > E[y|treated, pre] = a+b
> > E[y|not treated, post] = a+c
> > E[y|not treated, pre] = a
> >
> > with the dif-in-dif captured by
> >
> > DID = {E[y|treated, post]-E[y|treated, pre]} - {E[y|not treated, post]-E[y|not treated, pre]}
> >     = {a+b+c+d - (a+b)} - {a+c - a}
> >     = d
> >
> > with cluster-robust errors, as Austin mentioned. I don't see any
> > collinearity problems here.
> >
> >
> > On 3 January 2013 02:03, Austin Nichols <austinnichols@gmail.com> wrote:
> >>
> >> Barbara Engels <engels.ba@gmail.com> :
> >> You should certainly use cluster-robust SE to account for repeated
> >> observations, but how could you include FE and a dummy for treatment
> >> group? With a post dummy, and a treatment dummy, and the interaction,
> >> there would be a severe perfect collinearity problem.
> >>
> >> On Wed, Jan 2, 2013 at 3:49 PM, Barbara Engels <engels.ba@gmail.com> wrote:
> >> > Dear Stata people,
> >> >
> >> > I am currently working on a difference-in-differences model in its
> >> > simplest form - treatment and control group, pre- and
> >> > post-intervention period.
> >> > However, I got a large panel data set and I wonder what is the best
> >> > way to estimate the DID in Stata to account for flaws like serial
> >> > correlation.
> >> > Should I go for a simple
> >> >
> >> > reg y x incl. interaction term, ROBUST
> >> >
> >> > Or should I apply clustering?
> >> > Or even xtreg with fe?
> >> >
> >> > Any help is greatly appreciated.
> >> >
> >> > Thanks a lot, happy 2013!
> >> >
> >> > Barbara
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
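The points made in this thread can be tried out on simulated data. What follows is only a minimal sketch and not code posted by anyone in the thread: the variable names (group, id, treat, post, u_g, e_gt, a_i), the sample sizes, and the coefficient values are all hypothetical. It builds a two-period panel with shocks shared within groups, fits the DiD regression clustering first on id and then on group, and finally fits an individual fixed-effects version.

```stata
* Hypothetical simulated panel: 20 groups x 10 individuals x 2 periods
clear
set seed 12345
set obs 20                          // J = 20 groups
gen group = _n
gen treat = group <= 10             // first 10 groups are treated
gen u_g = rnormal()                 // time-invariant group-level shock
expand 10                           // 10 individuals per group, N = 200
gen id = _n
gen a_i = rnormal()                 // individual-level effect
expand 2                            // T = 2 periods per individual
bysort id: gen post = _n - 1        // 0 = pre, 1 = post

* Shock common to everyone in the same group and period
bysort group post: gen e_gt = rnormal() if _n == 1
by group post: replace e_gt = e_gt[1]

* Outcome with an assumed DiD effect d = 2
gen y = 1 + 0.5*treat + 0.3*post + 2*treat*post + u_g + e_gt + a_i + rnormal()

* DiD regression, clustering on individuals (N clusters)
regress y i.treat##i.post, vce(cluster id)

* Same regression, clustering on groups (only J clusters)
regress y i.treat##i.post, vce(cluster group)

* Individual fixed effects: treat is constant within id, so its
* main effect is collinear with the fixed effects
xtset id post
xtreg y i.treat##i.post, fe vce(cluster group)
```

In both regress calls the coefficient on 1.treat#1.post corresponds to d in the equation quoted above; only the standard errors should differ between the two clustering choices. In the xtreg, fe call the treat main effect should be dropped as collinear with the individual effects, which is the issue Austin raises, while the interaction remains estimable. With as few as 20 clusters the group-clustered standard errors may themselves be unreliable, so the numbers here are for illustration only.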

**References**:

- **st: Difference-in-Differences and Panel Data - In search of an adequate regression** *From:* Barbara Engels <engels.ba@gmail.com>
- **Re: st: Difference-in-Differences and Panel Data - In search of an adequate regression** *From:* Sjoerd van Bekkum <vanbekkum@ese.eur.nl>
