
## Why do I get an “invalid group specification r(198)” error when I fit a difference-in-differences (DID) model by using didregress or xtdidregress?

Title: The relationship between the group variable and the treatment dummy variable in didregress and xtdidregress

Author: Pei-Chun Lai, StataCorp

The DID model for repeated cross-sectional data fit by didregress is given by

$$Y_{ist}=\gamma_s+\gamma_t+z_{ist}\beta+D_{st}\delta+\varepsilon_{ist}$$

where $$i$$ is the observation-level index, $$s$$ is the group-level index, and $$t$$ is the time-level index. $$\gamma_s$$ denotes the group effects, $$\gamma_t$$ denotes the time effects, and $$z_{ist}$$ are the observation-level characteristics. The parameter of interest in this model is the average treatment effect on the treated (ATET), which is given by $$\delta$$. Note that the variable $$D_{st}$$ has no index $$i$$ because, at a given point in time, all units in group $$s$$ are either subject to the treatment or not. For example, if a specific tax policy is implemented in some states and not in others, all individuals in state $$s$$ at a given point in time $$t$$ are either subject to the tax policy (the treatment) or not.

DID models are generally used to study the effect of a policy by comparing outcomes before and after the policy takes effect while controlling for group effects ($$\gamma_s$$) and time effects ($$\gamma_t$$). $$D_{st}$$ is a binary variable that indicates the treated observations. In our state and time example, $$D_{st}=0$$ for every state, whether in the treated group or in the control group, before the treatment first occurs. After the treatment occurs, $$D_{st}=1$$ for the states in the treated group, but $$D_{st}=0$$ for the states in the control group. $$D_{st}$$ is specified in the second set of parentheses of didregress. For didregress to work, the control group must contain at least one state for which all observations have $$D_{st}=0$$ in every year.
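To see why a never-treated group is needed, consider a simplified case of the model above, assuming just two groups (one treated, one control), two periods, no covariates $$z_{ist}$$, and errors with mean zero. The labels "treated", "control", "pre", and "post" are introduced here for illustration only. Differencing the expected outcomes over time within each group gives

$$E[Y \mid \text{treated}, \text{post}] - E[Y \mid \text{treated}, \text{pre}] = (\gamma_{\text{post}}-\gamma_{\text{pre}}) + \delta$$

$$E[Y \mid \text{control}, \text{post}] - E[Y \mid \text{control}, \text{pre}] = \gamma_{\text{post}}-\gamma_{\text{pre}}$$

Subtracting the second line from the first isolates $$\delta$$. Without a group whose $$D_{st}$$ stays at 0, the second difference is unavailable, and in this simple setting the time effects cannot be separated from the treatment effect.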

If no state has $$D_{st}=0$$ in every year, that is, if every state has observations with both $$D_{st}=0$$ and $$D_{st}=1$$, as in the following table,


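               |       treatment
         State |  No treat  Yes treat |     Total
    -----------+----------------------+----------
             0 |       405      1,029 |     1,434
             1 |     2,343      2,265 |     4,608
             2 |       784        973 |     1,757
    -----------+----------------------+----------
         Total |     3,532      4,267 |     7,799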


we cannot assign any state to the control group, and we will get the error message

    invalid group specification
    None of the groups defined by state is a control.
    r(198)


after running didregress. In other words, the sample contains no never-treated state that could serve as the control group.
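Before fitting the model, it can help to check directly whether any group is never treated. The following is a minimal sketch, assuming a dataset in memory with a group variable named state and a treatment variable named treatment coded 0/1 (names taken from the table and error message above). The variables ever_treated and one_per_state are hypothetical helpers introduced here.

```stata
* Assumes: state identifies the groups; treatment is coded 0/1.
* ever_treated = 1 if the state has any treated observation, 0 otherwise
* (egen's max() ignores missing values of treatment).
bysort state: egen byte ever_treated = max(treatment)

* Mark one observation per state so that each state is counted once.
egen byte one_per_state = tag(state)

* Number of never-treated (0) and ever-treated (1) states.
tabulate ever_treated if one_per_state

* Count the never-treated states; a count of 0 means no control group.
count if one_per_state & ever_treated == 0
```

If the count is 0, no state qualifies as a control, which is the situation that leads to the r(198) error described above.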

If we examine hospdd.dta, the dataset used in the examples of the PDF manual entry for didregress, the group-versus-treatment table is

    . use https://www.stata-press.com/data/r17/hospdd, clear
    (Artificial hospital admission procedure data)

    . tabulate hospital procedure

      Hospital |       Old        New |     Total
    -----------+----------------------+----------
             1 |        92         92 |       184
             2 |        84         84 |       168
             3 |        76         76 |       152
             4 |       100        100 |       200
             5 |       100        100 |       200
             6 |       100        100 |       200
             7 |       116        116 |       232
             8 |        88         88 |       176
             9 |        80         80 |       160
            10 |        84         84 |       168
            11 |        88         88 |       176
            12 |        80         80 |       160
            13 |        92         92 |       184
            14 |        76         76 |       152
            15 |        76         76 |       152
            16 |        84         84 |       168
            17 |        72         72 |       144
            18 |        44         44 |        88
            19 |       152          0 |       152
            20 |       168          0 |       168
            21 |       136          0 |       136
            22 |       144          0 |       144
            23 |       152          0 |       152
            24 |       120          0 |       120
            25 |        96          0 |        96
            26 |       168          0 |       168
            27 |       192          0 |       192
            28 |       136          0 |       136
            29 |       160          0 |       160
            30 |        88          0 |        88
            31 |       168          0 |       168
            32 |       160          0 |       160
            33 |       168          0 |       168
            34 |       216          0 |       216
            35 |       192          0 |       192
            36 |       184          0 |       184
            37 |        96          0 |        96
            38 |       176          0 |       176
            39 |       144          0 |       144
            40 |       176          0 |       176
            41 |       192          0 |       192
            42 |       128          0 |       128
            43 |       152          0 |       152
            44 |       104          0 |       104
            45 |       192          0 |       192
            46 |       144          0 |       144
    -----------+----------------------+----------
         Total |     5,836      1,532 |     7,368


Because hospitals 1–18 have observations on both the Old and New procedures, we can assign them to the treated group. Because hospitals 19–46 have observations only on the Old procedure, we can assign them to the control group.

We can also check time versus treatment, as in this table:

    . use https://www.stata-press.com/data/r17/hospdd, clear
    (Artificial hospital admission procedure data)

    . tabulate month procedure

         Month |       Old        New |     Total
    -----------+----------------------+----------
       January |     1,842          0 |     1,842
      February |       921          0 |       921
         March |       921          0 |       921
         April |       538        383 |       921
           May |       538        383 |       921
          June |       538        383 |       921
          July |       538        383 |       921
    -----------+----------------------+----------
         Total |     5,836      1,532 |     7,368


This table reports that all observations are on the Old procedure before April, and that some observations are on the Old procedure and some are on the New procedure beginning in April.
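Because hospdd.dta contains both treated and never-treated hospitals, didregress can identify a control group. As a sketch, the model could be fit as shown below. The outcome variable satis is an assumption here: it does not appear in the tables above and is recalled from the didregress manual examples for this dataset, so it is worth confirming with describe first.

```stata
* Load the example dataset used above.
use https://www.stata-press.com/data/r17/hospdd, clear

* Inspect the variable names; satis (the outcome) is assumed, not shown above.
describe

* The outcome goes in the first set of parentheses and the 0/1 treatment
* variable (procedure) in the second; hospital is the group, month the time.
didregress (satis) (procedure), group(hospital) time(month)
```

The group() variable is the one checked against the treatment variable to find controls, so it should be the level at which the treatment is assigned (here, hospital).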

Note that the explanations above also apply to xtdidregress, which handles panel/longitudinal data.