Why do I get an “invalid group specification r(198)” error when I fit the standard difference-in-differences (DID) model by using didregress or xtdidregress?

Title   The relationship between the group variable and the treatment dummy variable in didregress and xtdidregress
Author Pei-Chun Lai, StataCorp

The standard DID model for repeated cross-sectional data fit by didregress is given by

$$Y_{ist}=\gamma_s+\gamma_t+z_{ist}\beta+D_{st}\delta+\varepsilon_{ist}$$

where \(i\) is the observation-level index, \(s\) is the group-level index, and \(t\) is the time-level index. \(\gamma_s\) denotes the group effects, \(\gamma_t\) denotes the time effects, and \(z_{ist}\) are the observation-level characteristics. The parameter of interest in this model is the average treatment effect on the treated (ATET), which is given by \(\delta\). Note that the variable \(D_{st}\) has no index \(i\) because, at a given point in time, all units in group \(s\) are either subject to the treatment or not. For example, if a specific tax policy is implemented in some states and not in others, all individuals in state \(s\) at a given point in time \(t\) are either subject to the tax policy (the treatment) or not.

The standard DID model is generally used to study the effect of a policy when outcomes are observed both before and after the policy takes effect, controlling for group effects (\(\gamma_s\)) and time effects (\(\gamma_t\)). \(D_{st}\) is a binary variable that indicates the treated observations. For our state and time example, \(D_{st}=0\) for every state, whether in the treated group or in the control group, before the treatment first occurs. After the treatment occurs, \(D_{st}=1\) for the states in the treated group, but \(D_{st}=0\) for the states in the control group. \(D_{st}\) is specified in the second set of parentheses of didregress. For didregress to work, the control group must contain at least one state for which all observations have \(D_{st}=0\) in every year.

If no state has \(D_{st}=0\) in every year, that is, if every state has observations with \(D_{st}=0\) and observations with \(D_{st}=1\),

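           |       treatment
     State |  No treat  Yes treat |     Total
-----------+----------------------+----------
         0 |       405      1,029 |     1,434
         1 |     2,343      2,265 |     4,608
         2 |       784        973 |     1,757
-----------+----------------------+----------
     Total |     3,532      4,267 |     7,799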

we cannot assign any state to the control group, and we will get the error message

invalid group specification
None of the groups defined by state is a control.
r(198)

after running didregress. In other words, the sample contains no never-treated state that could serve as the control group.
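Before fitting the model, we can check whether the data contain any never-treated group. The following is a minimal sketch, assuming the group variable is named state and the treatment dummy is named treatment and is coded 0/1 (these names are taken from the table and error message above; your variable names may differ). The new variables ever_treated and one_per_state are hypothetical helper names.

```stata
* Assumed names: state (group variable), treatment (0/1 treatment dummy)

* For each state, record whether any of its observations is treated
bysort state: egen byte ever_treated = max(treatment)

* Flag one observation per state so that each state is counted once
egen byte one_per_state = tag(state)

* Number of treated (1) and never-treated (0) states
tabulate ever_treated if one_per_state

* Number of never-treated states; a count of 0 is the situation
* described above, in which no state can serve as a control
count if one_per_state & ever_treated == 0
```

If the final count is zero, every state is treated at some point, which matches the case that produces the error.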

If we examine hospdd.dta, the dataset used in the examples of the PDF manual entry for didregress, the group-versus-treatment table is

. use https://www.stata-press.com/data/r18/hospdd, clear
(Artificial hospital admission procedure data)

. tabulate hospital procedure

  Hospital |  Admission procedure
        ID |       Old        New |     Total
-----------+----------------------+----------
         1 |        92         92 |       184
         2 |        84         84 |       168
         3 |        76         76 |       152
         4 |       100        100 |       200
         5 |       100        100 |       200
         6 |       100        100 |       200
         7 |       116        116 |       232
         8 |        88         88 |       176
         9 |        80         80 |       160
        10 |        84         84 |       168
        11 |        88         88 |       176
        12 |        80         80 |       160
        13 |        92         92 |       184
        14 |        76         76 |       152
        15 |        76         76 |       152
        16 |        84         84 |       168
        17 |        72         72 |       144
        18 |        44         44 |        88
        19 |       152          0 |       152
        20 |       168          0 |       168
        21 |       136          0 |       136
        22 |       144          0 |       144
        23 |       152          0 |       152
        24 |       120          0 |       120
        25 |        96          0 |        96
        26 |       168          0 |       168
        27 |       192          0 |       192
        28 |       136          0 |       136
        29 |       160          0 |       160
        30 |        88          0 |        88
        31 |       168          0 |       168
        32 |       160          0 |       160
        33 |       168          0 |       168
        34 |       216          0 |       216
        35 |       192          0 |       192
        36 |       184          0 |       184
        37 |        96          0 |        96
        38 |       176          0 |       176
        39 |       144          0 |       144
        40 |       176          0 |       176
        41 |       192          0 |       192
        42 |       128          0 |       128
        43 |       152          0 |       152
        44 |       104          0 |       104
        45 |       192          0 |       192
        46 |       144          0 |       144
-----------+----------------------+----------
     Total |     5,836      1,532 |     7,368

Because hospitals 1–18 have observations on both the Old and New procedures, we can assign them to the treated group. Because hospitals 19–46 have observations only on the Old procedure, we can assign them to the control group.
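With both a treated group and a control group present, the model can be fit. The sketch below follows the example in the manual entry for didregress; the outcome variable satis is not discussed in this FAQ and is assumed here to be the outcome used in that example.

```stata
use https://www.stata-press.com/data/r18/hospdd, clear

* First parentheses: outcome (satis, assumed from the manual example)
* Second parentheses: the treatment dummy D_st (procedure)
* group() identifies the groups s; time() identifies the periods t
didregress (satis) (procedure), group(hospital) time(month)
```

Because hospitals 19–46 have procedure equal to Old in every observation, didregress can treat them as controls, and the invalid group specification error should not occur.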

We can also check time versus treatment, as in this table:

. use https://www.stata-press.com/data/r18/hospdd, clear
(Artificial hospital admission procedure data)

. tabulate month procedure

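           |  Admission procedure
     Month |       Old        New |     Total
-----------+----------------------+----------
   January |     1,842          0 |     1,842
  February |       921          0 |       921
     March |       921          0 |       921
     April |       538        383 |       921
       May |       538        383 |       921
      June |       538        383 |       921
      July |       538        383 |       921
-----------+----------------------+----------
     Total |     5,836      1,532 |     7,368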

This table reports that all observations are on the Old procedure before April, and that some observations are on the Old procedure and some are on the New procedure beginning in April.
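The first treated period can also be found without reading the table. This is a minimal sketch, assuming that month is stored as a numeric variable with value labels and that procedure is coded 0 for Old and 1 for New.

```stata
use https://www.stata-press.com/data/r18/hospdd, clear

* Summarize month over treated observations only; with meanonly,
* summarize still stores r(min) and r(max)
summarize month if procedure == 1, meanonly

* Numeric codes of the earliest and latest months with treated observations
display "first treated month (numeric code): " r(min)
display "last treated month (numeric code):  " r(max)
```

The numeric codes can be matched to month names with label list or by comparing with the tabulate output above.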

Note that the explanations above also apply to xtdidregress, which handles panel/longitudinal data.
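For illustration, the same specification can be fit with xtdidregress. This sketch assumes, as in the manual entry's example, that hospital is treated as the panel identifier and that satis is the outcome; neither assumption comes from this FAQ.

```stata
use https://www.stata-press.com/data/r18/hospdd, clear

* xtdidregress requires the panel identifier to be declared with xtset
xtset hospital

* Same layout as didregress: outcome first, treatment dummy second
xtdidregress (satis) (procedure), group(hospital) time(month)
```

The same requirement applies here: at least one group must have the treatment dummy equal to 0 in all of its observations.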