Difference-in-differences (DID) and DDD models

Highlights

DID and DDD ATET estimators for repeated cross-sections and panel data
Wild-bootstrap p-values and confidence intervals
Bias-corrected standard errors using the Bell and McCaffrey degrees-of-freedom adjustment
ATET estimates and standard errors using the Donald and Lang method
Mean-outcome and parallel-trends graphical diagnostics
Granger-type and parallel-trends tests
Time-specific treatment effects

Stata's didregress and xtdidregress commands fit DID and DDD models that control for unobserved group and time effects. didregress is for repeated cross-sectional data, where we sample different units of observation at different points in time. xtdidregress is for panel (longitudinal) data. Both commands provide a unified framework for obtaining inference that is appropriate for a variety of study designs.

When average treatment effects vary over time and over cohort, you can use the hdidregress and xthdidregress commands to estimate heterogeneous average treatment effects on the treated (ATETs).

Difference in differences (DID) offers a nonexperimental technique to estimate the ATET by comparing the difference across time in the differences between outcome means in the control and treatment groups, hence the name difference in differences. This technique controls for unobservable time and group characteristics that confound the effect of the treatment on the outcome.
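
As a sketch of the idea in the simplest case of two groups and two periods, the DID estimate of the ATET can be written as below. The notation is ours and is only illustrative: \bar{Y} is a sample mean of the outcome, the subscripts T and C denote the treatment and control groups, and "pre" and "post" denote the periods before and after treatment.

```latex
\widehat{\mathrm{ATET}}_{\mathrm{DID}}
  = \left(\bar{Y}_{T,\mathrm{post}} - \bar{Y}_{T,\mathrm{pre}}\right)
  - \left(\bar{Y}_{C,\mathrm{post}} - \bar{Y}_{C,\mathrm{pre}}\right)
  = \left(\bar{Y}_{T,\mathrm{post}} - \bar{Y}_{C,\mathrm{post}}\right)
  - \left(\bar{Y}_{T,\mathrm{pre}} - \bar{Y}_{C,\mathrm{pre}}\right)
```

The two forms are algebraically identical. Differencing over time removes group characteristics that do not change over time, and differencing across groups removes time effects that are common to both groups.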

Difference in difference in differences (DDD) adds a control group to the DID framework to account for unobservable group- and time-characteristic interactions that might not be captured by DID. It augments DID with another difference for the new control group, hence the name difference in difference in differences.
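
Under the same illustrative two-period notation, and assuming the added comparison splits each group into a subgroup A that the treatment targets and a subgroup B that it does not (these labels are ours), the DDD estimate can be sketched as the difference between two DID contrasts:

```latex
\widehat{\mathrm{ATET}}_{\mathrm{DDD}}
  = \left[\left(\bar{Y}_{T,A,\mathrm{post}} - \bar{Y}_{T,A,\mathrm{pre}}\right)
        - \left(\bar{Y}_{C,A,\mathrm{post}} - \bar{Y}_{C,A,\mathrm{pre}}\right)\right]
  - \left[\left(\bar{Y}_{T,B,\mathrm{post}} - \bar{Y}_{T,B,\mathrm{pre}}\right)
        - \left(\bar{Y}_{C,B,\mathrm{post}} - \bar{Y}_{C,B,\mathrm{pre}}\right)\right]
```

The second bracket estimates any group-by-time change that is unrelated to the treatment, and subtracting it removes that change from the first bracket.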

Examples of treatment effects include the effect of a drug regimen on blood pressure, a surgical procedure on mobility, a training program on employment, or an ad campaign on sales.

Let's see it work

A health provider wants to study the effect of a new hospital admissions procedure on patient satisfaction using monthly data on patients before and after the new procedure was implemented in some of their hospitals. The health provider will use DID regression to analyze the effect of the new admissions procedure on the hospitals that participated in the program. The outcome of interest is patient satisfaction, satis, and the treatment variable is procedure. We can fit this model using didregress.

. webuse hospdd
. didregress (satis) (procedure), group(hospital) time(month)

The first set of parentheses specifies the outcome of interest followed by any covariates in the model. In this case, there are no covariates. The second set of parentheses specifies the binary variable that indicates the treated observations, procedure. The group() and time() options are used to construct the group and time fixed effects that are included in the model. The variable specified in group() also defines the level of clustering for the default cluster-robust standard errors. For this example, we cluster at the hospital level. The results from this command are

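Treatment and time information

Time variable: month
Control:       procedure = 0
Treatment:     procedure = 1
-----------------------------------
             |   Control  Treatment
-------------+---------------------
Group        |
    hospital |        28         18
-------------+---------------------
Time         |
     Minimum |         1          4
     Maximum |         1          4
-----------------------------------
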
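Difference in differences regression                     Number of obs = 7,368
Data type: Repeated cross-sectional

                              (Std. err. adjusted for 46 clusters in hospital)
------------------------------------------------------------------------------
             |               Robust
       satis | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATET         |
   procedure |
(New vs Old) |   .8479879   .0321121    26.41   0.000     .7833108     .912665
------------------------------------------------------------------------------
Note: ATET estimate adjusted for group effects and time effects.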

The first table gives information about the control and treatment groups and about treatment timing. The first section tells us that 28 hospitals continued to use the old procedure and 18 hospitals switched to the new one. The second section tells us that all hospitals that implemented the new procedure did so in the fourth time period. If some hospitals had adopted the policy later, the minimum and maximum time of the first treatment would differ.

The second table gives the estimated ATET, 0.85 (95% CI [0.78, 0.91]). Treatment hospitals had a 0.85-point increase in patient satisfaction relative to what they would have had if they hadn't implemented the new procedure.
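
Because the model includes group and time fixed effects, the ATET reported above should also be recoverable from an ordinary regression with hospital and month effects. The following is a minimal sketch of that cross-check rather than a description of how didregress works internally. It assumes the same hospdd dataset and uses areg, a standard Stata command. We expect the coefficient on 1.procedure to match the ATET, while the standard errors may differ slightly because of different degrees-of-freedom adjustments.

```stata
* Sketch: cross-check the DID point estimate with a two-way fixed-effects regression
webuse hospdd, clear

* Absorb hospital effects, add month indicators, and cluster by hospital
areg satis i.procedure i.month, absorb(hospital) vce(cluster hospital)
```

Agreement between the two point estimates is a useful check that the treatment indicator and the group and time variables are coded as intended.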

One of the assumptions this model makes is that the trajectories of satis are parallel for the control and treatment groups prior to implementation of the new procedure. A visual check of these trajectories can be obtained by plotting the means of the outcome over time for both groups or by visualizing the results of the linear-trends model. We can perform both of these diagnostic checks using estat trendplots.

. estat trendplots

Prior to the policy implementation, control and treatment hospitals followed a parallel path. We can further evaluate this assumption using a parallel-trends test with estat ptrends.

. estat ptrends

Parallel-trends test (pretreatment time period)
H0: Linear trends are parallel

F(1, 45) =   0.55
Prob > F = 0.4615

We do not have sufficient evidence to reject the null hypothesis of parallel trends. This test and the graphical analysis support the parallel-trends assumption.

Another test we may want to conduct is to see if, in anticipation of treatment, the control or treatment groups change their behavior. This is evaluated with the Granger causality test using estat granger.

. estat granger

Granger causality test
H0: No effect in anticipation of treatment

F(2, 45) =   0.33
Prob > F = 0.7239

We do not have sufficient evidence to reject the null hypothesis of no behavior change prior to treatment. Together with our previous diagnostics, these results suggest that we should trust the validity of our ATET estimate.

In this example, we had a sufficient number of hospitals (46) to make reliable inferences about our treatment effect. If we had data from only 15 hospitals, however, we might consider alternative methods.

To use bias-corrected standard errors with the Bell and McCaffrey (2002) degrees-of-freedom adjustment, we can add the vce(hc2) option.

. didregress (satis) (procedure), group(hospital) time(month) vce(hc2)

To use the aggregation method proposed by Donald and Lang (2007), we can add the aggregate(dlang) option.

. didregress (satis) (procedure), group(hospital) time(month) aggregate(dlang)

We can add the varying suboption if we want to allow some of the coefficients to vary across groups.

. didregress (satis) (procedure), group(hospital) time(month) aggregate(dlang, varying)

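We can also use the wild-cluster bootstrap to obtain p-values and confidence intervals. As with all bootstrap-type methods, we need to set a seed to make our results replicable. The output below comes from a sample of 15 hospitals (7 control and 8 treatment), as its first table shows.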

. didregress (satis) (procedure), group(hospital) time(month) wildbootstrap(rseed(111))

Performing 1,000 replications for p-value for constraint
  procedure = 0 ...
Computing confidence interval for procedure
  Lower bound: ...... done (6)
  Upper bound: ...... done (6)
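
Treatment and time information

Time variable: month
Control:       procedure = 0
Treatment:     procedure = 1
-----------------------------------
             |   Control  Treatment
-------------+---------------------
Group        |
    hospital |         7          8
-------------+---------------------
Time         |
     Minimum |         1          4
     Maximum |         1          4
-----------------------------------

DID with wild-cluster bootstrap inference              Number of obs   = 2,192
                                                       Replications    = 1,000
Data type:    Repeated cross-sectional
Error weight: rademacher

------------------------------------------------------------------
       satis | Coefficient     t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------
ATET         |
   procedure |
(New vs Old) |   .860162    19.72   0.000     .7714875    .9587552
------------------------------------------------------------------
Note: ATET estimate adjusted for group effects and time effects.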

The confidence interval and p-value above provide reliable inference for cases where the number of groups is small. These results can be interpreted in the same way as our original model.
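
The run above used the defaults shown in its header: 1,000 replications and Rademacher error weights. As a hedged sketch, the bootstrap can be tuned through suboptions of wildbootstrap(). We believe reps() and errorweight() are the relevant suboption names, but check [CAUSAL] didregress for the exact list in your version of Stata. The values 2000 and webb are illustrative choices of ours, not recommendations from the original example.

```stata
* Sketch: wild-cluster bootstrap with more replications and Webb error weights
* (suboption names assumed from the didregress documentation; verify before use)
webuse hospdd, clear

didregress (satis) (procedure), group(hospital) time(month) ///
    wildbootstrap(rseed(111) reps(2000) errorweight(webb))
```

More replications reduce simulation noise in the p-value and the confidence bounds, at the cost of a longer run time.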

References

Bell, R. M., and D. F. McCaffrey. 2002. Bias reduction in standard errors for linear regression with multi-stage samples. Survey Methodology 28: 169–181.

Donald, S. G., and K. Lang. 2007. Inference with difference-in-differences and other panel data. Review of Economics and Statistics 89: 221–233.

Tell me more

You can read more about the intuition behind DID and its implementation in Stata in [CAUSAL] DID intro.

You can see more details on this method and some additional examples in [CAUSAL] didregress.
