The Stata News, Vol 38 No 3


In the spotlight: Heterogeneous DID: A new way to approach an old problem

The first use of difference in differences (DID) is usually traced to John Snow's work on cholera in the 1850s. He wanted to demonstrate that contaminated water caused cholera and to refute alternative explanations. To do so, he needed to establish a causal mechanism free of confounding effects. This is at the heart of causal inference, and it reflects an old human desire: we want to know whether an action we undertake has a causal consequence that is not confounded by other factors.

DID controls for unobserved group effects and unobserved time effects. What is left, after controlling for both, is the effect of the treatment on the group subject to the intervention: the average treatment effect on the treated (ATET). In Stata 17, we provided DID estimators for cross-sectional data (didregress) and for longitudinal data (xtdidregress). We also provided diagnostics for the assumptions on which the validity of the approach usually rests: a test of parallel, or common, trends (estat ptrends) and a test of anticipation of treatment (estat granger). Both have graphical and tabular outputs.
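For the simplest case, with two groups and two periods, the estimator can be written out explicitly. The notation here is an assumption made for illustration and is not taken from the Stata documentation: $\bar{Y}$ is a sample mean of the outcome, $D = 1$ marks the treated group, and "pre" and "post" are the periods before and after the intervention.

```latex
\[
\widehat{\mathrm{ATET}}
  = \bigl(\bar{Y}_{D=1,\,\mathrm{post}} - \bar{Y}_{D=1,\,\mathrm{pre}}\bigr)
  - \bigl(\bar{Y}_{D=0,\,\mathrm{post}} - \bar{Y}_{D=0,\,\mathrm{pre}}\bigr)
\]
```

Differencing over time within each group removes time-invariant group effects. Differencing across groups then removes time effects common to both groups, provided the two groups would have followed parallel trends in the absence of treatment.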

Yet there is another assumption underlying the validity of the estimators in didregress and xtdidregress that needs to be accounted for. Both of the estimators assume that the ATET does not change over time and that, if individuals are treated at different points in time, their ATETs are the same. In other words, the ATETs are assumed to be homogeneous across time and treatment cohorts.
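The homogeneity assumption can be stated compactly. The notation is again assumed for illustration, following the usual potential-outcomes convention: $G$ is the period in which a group is first treated, $Y_t(g)$ is the outcome at time $t$ if first treated at $g$, and $Y_t(0)$ is the outcome at time $t$ if never treated.

```latex
\[
\mathrm{ATET}(g,t) = E\bigl[\,Y_t(g) - Y_t(0) \mid G = g\,\bigr],
\qquad
\text{homogeneity: } \mathrm{ATET}(g,t) = \tau
\ \text{ for every cohort } g \text{ and every } t \ge g .
\]
```

Under homogeneity a single number, $\tau$, summarizes the effect. Without it, there is one effect for each cohort and time pair.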

In Stata 18, you can check whether the homogeneity assumption is reasonable. After didregress or xtdidregress, you would type

. estat bdecomp

This gives you the Bacon decomposition. It decomposes the ATET into a weighted average of two-period two-group treatment effects, which lets us know if there is timing or group variation. It also points out “bad” comparisons that occur when running didregress or xtdidregress. These occur when an already treated group acts as a control group.
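A minimal do-file sketch of the homogeneous-ATET workflow, ending with the decomposition, might look as follows. It assumes the akc example dataset that the worked example later in this article loads, and it omits the covariate best to keep the model as simple as possible. It is an illustration of the command sequence, not output reproduced from the article.

```stata
* Sketch only: assumes the akc example dataset used later in this article
webuse akc, clear
xtset breed year

* Homogeneous model: a single ATET for all cohorts and all years
xtdidregress (registered) (movie), group(breed) time(year)

* Diagnostics available since Stata 17
estat ptrends      // parallel-trends test
estat granger      // anticipation (Granger-type) test

* New in Stata 18: Bacon decomposition of the single ATET
estat bdecomp
```

Because breeds in these data are first featured in 2034, 2036, and 2037, comparisons that use already treated breeds as controls are possible, which is the situation the decomposition is designed to expose.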

So what do you do if there is ATET heterogeneity or if you do not want to assume homogeneity? Use the new hdidregress and xthdidregress commands to fit models for which the ATET changes over time or over treatment cohort. You will get ATET estimates that vary over treatment cohort and time, and you may graph these effects.

Let's see it work

Suppose we are interested in how the number of registrations of a dog breed with the American Kennel Club (AKC), stored in the variable registered, is affected by the breed being the protagonist of a movie, indicated by the variable movie. We conjecture that registrations increase when a breed appears as the protagonist of a movie. We also conjecture that registrations increase if the breed has won the Best in Show award from the Westminster Kennel Club in the 10 years before 2034, indicated by the variable best. We use simulated future data, but there is past evidence of the effect of movies on dog breed registrations; see, for example, Ghirlanda, Acerbi, and Herzog (2014).

There are 141 dog breeds in our sample, which spans the years 2031 to 2040. At the beginning of the sample, no breed has been featured in a movie. This changes in 2034, when 4 breeds are featured. The number of featured breeds next rises in 2036, to 7. In 2037, there is a substantial increase, to 22 breeds. The table below illustrates this.

. webuse akc, clear
(Fictional dog breed and AKC registration data)

. tabulate year movie

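           |      Was a movie
           |      protagonist
      Year |         0          1 |     Total
-----------+----------------------+----------
      2031 |       141          0 |       141
      2032 |       141          0 |       141
      2033 |       141          0 |       141
      2034 |       137          4 |       141
      2035 |       137          4 |       141
      2036 |       134          7 |       141
      2037 |       119         22 |       141
      2038 |       119         22 |       141
      2039 |       119         22 |       141
      2040 |       119         22 |       141
-----------+----------------------+----------
     Total |     1,307        103 |     1,410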

There are three treatment cohorts (2034, 2036, and 2037). We conjecture that treatment effects differ and evolve over time. To fit a regression adjustment model after using xtset, we type

. xtset breed year

. xthdidregress ra (registered best) (movie), group(breed)

In the first set of parentheses, we define the outcome, registered, and any covariates that affect the outcome directly. In the second set of parentheses, we define the observation-level treatment variable, movie. After the comma, we need to define the group variable in group(); this is a required option. The group variable defines at which level the treatment occurs and also identifies the clustering variable, which in this case is breed. We obtain

. xthdidregress ra (registered best) (movie), group(breed)
note: variable _did_cohort, containing cohort indicators formed by treatment
      variable movie and group variable breed, was added to the dataset.

Computing ATET for each cohort and time:
Cohort 2034 (9): ......... done
Cohort 2036 (9): ......... done
Cohort 2037 (9): ......... done

Treatment and time information

Time variable: year
Time interval: 2031 to 2040
Control:       _did_cohort = 0
Treatment:     _did_cohort > 0
-----------------------------------
                  |  _did_cohort
------------------+----------------
Number of cohorts |            4
------------------+----------------
Number of obs     |
    Never treated |         1190
             2034 |           40
             2036 |           30
             2037 |          150
-----------------------------------

Heterogeneous-treatment-effects regression            Number of obs    = 1,410
                                                      Number of panels =   141
Estimator:       Regression adjustment
Panel variable:  breed
Treatment level: breed
Control group:   Never treated

                                (Std. err. adjusted for 141 clusters in breed)
------------------------------------------------------------------------------
             |               Robust
      Cohort |       ATET   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
2034         |
        year |
        2032 |  -254.8927   266.1024    -0.96   0.338    -776.4439    266.6584
        2033 |  -257.5329   217.9389    -1.18   0.237    -684.6852    169.6194
        2034 |   701.1318   127.0935     5.52   0.000     452.0331    950.2304
        2035 |   1099.044   282.0704     3.90   0.000      546.196    1651.892
        2036 |   1367.632   225.8702     6.05   0.000     924.9343    1810.329
        2037 |   2008.294   237.2396     8.47   0.000     1543.313    2473.275
        2038 |   2472.624   278.2949     8.88   0.000     1927.176    3018.072
        2039 |   2689.615   504.3324     5.33   0.000     1701.142    3678.088
        2040 |    3110.97    568.916     5.47   0.000     1995.915    4226.025
-------------+----------------------------------------------------------------
2036         |
        year |
        2032 |   216.0259   122.9107     1.76   0.079    -24.87472    456.9265
        2033 |  -172.5154   372.0776    -0.46   0.643    -901.7741    556.7433
        2034 |  -218.0495   504.5267    -0.43   0.666    -1206.904    770.8045
        2035 |    621.033   156.1306     3.98   0.000     315.0227    927.0434
        2036 |   999.0781   180.1055     5.55   0.000     646.0779    1352.078
        2037 |   1003.333   250.5916     4.00   0.000     512.1829    1494.484
        2038 |   1556.669   451.6914     3.45   0.001     671.3697    2441.967
        2039 |   2590.674   662.6979     3.91   0.000      1291.81    3889.538
        2040 |   2225.712   486.9917     4.57   0.000     1271.225    3180.198
-------------+----------------------------------------------------------------
2037         |
        year |
        2032 |   -114.582   160.0972    -0.72   0.474    -428.3668    199.2028
        2033 |  -127.9856   183.3941    -0.70   0.485    -487.4315    231.4603
        2034 |   33.40901   168.0312     0.20   0.842    -295.9262    362.7442
        2035 |   130.3495   166.2261     0.78   0.433    -195.4477    456.1468
        2036 |  -10.48288   167.5059    -0.06   0.950    -338.7884    317.8226
        2037 |   1717.016   268.5592     6.39   0.000      1190.65    2243.383
        2038 |   2086.798   278.0215     7.51   0.000     1541.886     2631.71
        2039 |   2473.611    268.186     9.22   0.000     1947.976    2999.246
        2040 |   2835.117   378.6699     7.49   0.000     2092.938    3577.296
------------------------------------------------------------------------------
Note: ATET computed using covariates.

Notice the note below the command. A variable with the name _did_cohort has been generated. Using the group variable and the observation-level treatment, xthdidregress generated treatment-time cohorts. The new variable creates treatment groups based on the time when a group was first treated. For instance, if a Boxer and a Rottweiler are featured in movies in 2034, they are grouped in the 2034 cohort. The variable also contains a category for a control group. In this case, the control group is formed by the breeds that are not featured in a movie. Cohorts are important inputs for estimation and postestimation commands.
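To inspect the generated cohorts directly, one could tabulate _did_cohort by observation and by breed. This is a minimal sketch that assumes xthdidregress has just been run as above; the variable name onebreed is hypothetical and chosen only for this illustration.

```stata
* Assumes xthdidregress has just been run, so _did_cohort exists

* Flag exactly one observation per breed (onebreed is a made-up name)
egen byte onebreed = tag(breed)

* Observations per cohort; should agree with the "Number of obs" table
tabulate _did_cohort

* Breeds per cohort; each breed contributes 10 yearly observations
tabulate _did_cohort if onebreed
```

Given the observation counts reported in the output (1,190, 40, 30, and 150) and 10 years per breed, the second tabulation should show 119 never-treated breeds and 4, 3, and 15 breeds in the 2034, 2036, and 2037 cohorts.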

Next appears a table that gives you a sense of the treatment groups and times. You see the time variable, year, and its range, 2031 to 2040. Then we see what defines a treated or a control group. The table below that provides group-level information about the cohort–time groups. The first row tells you the number of cohorts. Following the number of cohorts is a tabulation showing how many observations are in each cohort. For instance, 1,190 observations are never treated in our data. The table gives you a sense of the amount of information available in each cohort and might hint at the variability of cohort-level estimates.

It is difficult to see the trends in ATETs just by looking at all the ATET estimates. We can use estat atetplot to visualize the time profile of the ATETs for each cohort.

. estat atetplot

[Figure: atetplot.svg, time profile of the ATETs for each cohort]

After fitting the model, we can use estat aggregation to aggregate the ATETs within cohort, time, and exposure to treatment. This command provides a summary of different aspects of ATETs. For example, we use estat aggregation, cohort to summarize the ATETs within cohort. We also specify option graph to obtain a graph of aggregations in addition to the tabular output.

. estat aggregation, cohort graph

ATET over cohort                                         Number of obs = 1,410

                                (Std. err. adjusted for 141 clusters in breed)
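------------------------------------------------------------------------------
             |               Robust
      Cohort |       ATET   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        2034 |    1921.33   187.2787    10.26   0.000     1554.271    2288.389
        2036 |   1675.093   130.4929    12.84   0.000     1419.332    1930.855
        2037 |   2278.136   166.5283    13.68   0.000     1951.746    2604.525
------------------------------------------------------------------------------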

[Figure: cohort.svg, ATET aggregated over cohort]
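The cohort aggregates can be checked by hand against the cohort-by-year estimates reported by xthdidregress. The arithmetic below is a sketch with the estimates typed in from the output above. The reported values are consistent with a simple average of each cohort's ATETs over its treated years, although the article does not state the weighting scheme.

```stata
* Cohort 2034: average of its seven ATETs for 2034 through 2040
display %9.3f (701.1318 + 1099.044 + 1367.632 + 2008.294 ///
    + 2472.624 + 2689.615 + 3110.97) / 7
* displays 1921.330, matching the reported 1921.33

* Cohort 2036: average of its five ATETs for 2036 through 2040
display %9.3f (999.0781 + 1003.333 + 1556.669 + 2590.674 + 2225.712) / 5
* displays 1675.093, matching the reported 1675.093
```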

If we want to summarize ATETs within time, we specify option time with estat aggregation.

. estat aggregation, time graph

ATET over time                                           Number of obs = 1,410

                                (Std. err. adjusted for 141 clusters in breed)
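------------------------------------------------------------------------------
             |               Robust
        Time |       ATET   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        2034 |   701.1318   127.0935     5.52   0.000     452.0331    950.2304
        2035 |   1099.044   282.0704     3.90   0.000      546.196    1651.892
        2036 |    1209.68   170.2043     7.11   0.000     876.0858    1543.275
        2037 |   1672.655   202.1854     8.27   0.000     1276.379    2068.932
        2038 |   2084.658   214.5072     9.72   0.000     1664.232    2505.084
        2039 |   2528.847   225.8763    11.20   0.000     2086.138    2971.557
        2040 |   2802.171   291.8412     9.60   0.000     2230.173     3374.17
------------------------------------------------------------------------------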

[Figure: time.svg, ATET aggregated over time]
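The time aggregates combine, for each year, the ATETs of the cohorts already treated in that year. The sketch below uses estimates typed in from the xthdidregress output and weights each cohort by its number of breeds (4, 3, and 15, implied by the observation counts of 40, 30, and 150 over 10 years). That weighting is an inference from the numbers, not something the article states, but it reproduces the reported values.

```stata
* 2036: cohorts 2034 (4 breeds) and 2036 (3 breeds) are treated
display %9.3f (4*1367.632 + 3*999.0781) / (4 + 3)
* displays 1209.680, matching the reported 1209.68

* 2037: all three cohorts are treated (4, 3, and 15 breeds)
display %9.3f (4*2008.294 + 3*1003.333 + 15*1717.016) / (4 + 3 + 15)
* displays 1672.655, matching the reported 1672.655
```

In 2034 and 2035 only the 2034 cohort is treated, which is why those two rows repeat that cohort's estimates exactly.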

Finally, if we want to summarize ATETs over different lengths of exposure to treatment, we specify option dynamic.

. estat aggregation, dynamic graph

Duration of exposure ATET                                Number of obs = 1,410

                                (Std. err. adjusted for 141 clusters in breed)
------------------------------------------------------------------------------
             |               Robust
    Exposure |       ATET   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
          -5 |   -114.582   160.0972    -0.72   0.474    -428.3668    199.2028
          -4 |  -70.65034   156.3185    -0.45   0.651     -377.029    235.7283
          -3 |  -.9117242   153.0999    -0.01   0.995     -300.982    299.1585
          -2 |   12.79653   144.8216     0.09   0.930    -271.0486    296.6417
          -1 |   30.71473   132.8508     0.23   0.817     -229.668    291.0975
           0 |   1434.409   206.3277     6.95   0.000     1030.014    1838.804
           1 |   1759.461   224.0229     7.85   0.000     1320.385    2198.538
           2 |   2147.486    221.903     9.68   0.000     1712.564    2582.408
           3 |   2651.452   284.8928     9.31   0.000     2093.073    3209.832
           4 |   2366.805   267.4253     8.85   0.000     1842.661    2890.949
           5 |   2689.615   504.3324     5.33   0.000     1701.142    3678.088
           6 |    3110.97    568.916     5.47   0.000     1995.915    4226.025
------------------------------------------------------------------------------
Note: Exposure is the number of periods since the first treatment time.

[Figure: dynamic.svg, ATET aggregated over duration of exposure]
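The dynamic aggregates realign the cohorts by exposure, so that exposure 0 is each cohort's own first treated year. As a sketch, with estimates typed in from the xthdidregress output and the same assumed weights of 4, 3, and 15 breeds per cohort:

```stata
* Exposure 0: cohort 2034 in 2034, cohort 2036 in 2036, cohort 2037 in 2037
display %9.3f (4*701.1318 + 3*999.0781 + 15*1717.016) / (4 + 3 + 15)
* displays 1434.409, matching the reported 1434.409

* Exposure 4: cohort 2034 in 2038 and cohort 2036 in 2040;
* cohort 2037 would need data for 2041, which the sample does not contain
display %9.3f (4*2472.624 + 3*2225.712) / (4 + 3)
* displays 2366.805, matching the reported 2366.805
```

Fewer cohorts contribute at longer exposures. Exposures 5 and 6 rest on the 4 breeds of the 2034 cohort alone, which is reflected in their larger standard errors.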

Parting words

In Stata, we can fit both heterogeneous and homogeneous ATETs, check model assumptions, and explore results in tabular and graphical forms. To learn more, see [CAUSAL] hdidregress and [CAUSAL] xthdidregress.

Reference

Ghirlanda, S., A. Acerbi, and H. Herzog. 2014. Dog movie stars and dog breed popularity: A case study in media influence on choice. PLOS ONE 9: e106565. https://doi.org/10.1371/journal.pone.0106565.

— by Enrique Pinzón
Director of Econometrics
