Linear regression with endogenous treatment effects

Stata’s etregress allows you to estimate an average treatment effect (ATE) and the other parameters of a linear regression model augmented with an endogenous binary-treatment variable. You just specify the treatment variable and the treatment covariates in the treat() option. The average treatment effect on the treated (ATET) can also be estimated with etregress.

We estimate the ATE of being a union member on the wages of women with etregress. The outcome covariates are age, grade, and tenure, along with indicators for living in an SMSA (standard metropolitan statistical area) and being African American. An indicator for living in the southern region of the United States enters the treatment equation, together with the African American indicator and tenure.

. webuse union3
(NLS Women 14-24 in 1968)

. etregress wage age grade smsa black tenure, treat(union = south black tenure)

Iteration 0:  Log likelihood =  -3140.811
Iteration 1:  Log likelihood = -3053.6629
Iteration 2:  Log likelihood = -3051.5847
Iteration 3:  Log likelihood =  -3051.575
Iteration 4:  Log likelihood =  -3051.575

Linear regression with endogenous treatment             Number of obs =  1,210
Estimator: Maximum likelihood                           Wald chi2(6)  = 681.89
Log likelihood = -3051.575                              Prob > chi2   = 0.0000

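------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
wage         |
         age |   .1487409   .0193291     7.70   0.000     .1108566    .1866252
       grade |   .4205658   .0293577    14.33   0.000     .3630258    .4781058
        smsa |   .9117044   .1249041     7.30   0.000     .6668969    1.156512
       black |  -.7882471   .1367078    -5.77   0.000     -1.05619   -.5203048
      tenure |   .1524015   .0369596     4.12   0.000     .0799621    .2248409
     1.union |   2.945815   .2749621    10.71   0.000       2.4069    3.484731
       _cons |  -4.351572   .5283952    -8.24   0.000    -5.387208   -3.315936
-------------+----------------------------------------------------------------
union        |
       south |  -.5807419   .0851111    -6.82   0.000    -.7475566   -.4139271
       black |   .4557499   .0958042     4.76   0.000     .2679771    .6435226
      tenure |   .0871536   .0232483     3.75   0.000     .0415878    .1327195
       _cons |  -.8855758   .0724506   -12.22   0.000    -1.027576   -.7435753
-------------+----------------------------------------------------------------
     /athrho |  -.6544347   .0910314    -7.19   0.000     -.832853   -.4760164
    /lnsigma |   .7026769   .0293372    23.95   0.000      .645177    .7601767
-------------+----------------------------------------------------------------
         rho |  -.5746478    .060971                      -.682005   -.4430476
       sigma |   2.019151   .0592362                      1.906325    2.138654
      lambda |    -1.1603   .1495097                     -1.453334   -.8672668
------------------------------------------------------------------------------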

LR test of indep. eqns. (rho = 0):   chi2(1) =    19.84   Prob > chi2 = 0.0000
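
The ancillary estimates at the bottom of the table are linked by simple transformations: rho is the hyperbolic tangent of /athrho, sigma is the exponential of /lnsigma, and lambda is the product of rho and sigma. As a rough check, the commands below recompute them from the stored coefficients. This is a sketch that assumes a Stata version in which free parameters can be addressed as _b[/name]; older versions name these parameters differently.

. * rho = tanh(athrho)
. display tanh(_b[/athrho])
. * sigma = exp(lnsigma)
. display exp(_b[/lnsigma])
. * lambda = rho*sigma
. display tanh(_b[/athrho])*exp(_b[/lnsigma])

The three results should agree with the reported rho, sigma, and lambda up to display rounding. For instance, -.5746478 times 2.019151 is approximately -1.1603.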


The estimated ATE of being a union member is 2.95. The ATET is the same as the ATE in this case because the treatment indicator variable has not been interacted with any of the outcome covariates.

When the treatment variable is interacted with outcome covariates, the ATE is no longer a single coefficient, but margins can estimate it from the etregress parameter estimates. Now we use factor-variable notation to allow the tenure and black coefficients to vary with union membership. We specify vce(robust) when fitting the model because margins requires it for the vce(unconditional) option we use below.

. etregress wage age grade south i.union#c.(black tenure), treat(union = south black tenure) vce(robust)

Iteration 0:   Log pseudolikelihood = -3093.9289
Iteration 1:   Log pseudolikelihood = -3069.8014
Iteration 2:   Log pseudolikelihood = -3069.0214
Iteration 3:   Log pseudolikelihood = -3069.0106
Iteration 4:   Log pseudolikelihood = -3069.0106

Linear regression with endogenous treatment             Number of obs =  1,210
Estimator: Maximum likelihood                           Wald chi2(8)  = 445.85
Log pseudolikelihood = -3069.0106                       Prob > chi2   = 0.0000

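--------------------------------------------------------------------------------
               |               Robust
               | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
---------------+----------------------------------------------------------------
wage           |
           age |   .1547605    .020634     7.50   0.000     .1143186    .1952025
         grade |   .4328724   .0372888    11.61   0.000     .3597876    .5059572
         south |  -.5060951   .2009611    -2.52   0.012    -.8999716   -.1122186
               |
 union#c.black |
            0  |  -.4695395   .2009365    -2.34   0.019    -.8633677   -.0757112
            1  |  -.8580219   .2893336    -2.97   0.003    -1.425105   -.2909385
               |
union#c.tenure |
            0  |   .1802719   .0545018     3.31   0.001     .0734504    .2870934
            1  |   .0848265   .0929442     0.91   0.361    -.0973408    .2669938
               |
         union |
            1  |   3.060777   .9504098     3.22   0.001     1.198008    4.923546
               |
         _cons |  -3.847881   .6560055    -5.87   0.000    -5.133628   -2.562133
---------------+----------------------------------------------------------------
union          |
         south |  -.5041281   .0932344    -5.41   0.000    -.6868642    -.321392
         black |   .4506167   .0953425     4.73   0.000     .2637489    .6374845
        tenure |   .0917203   .0260037     3.53   0.000      .040754    .1426867
         _cons |  -.9325238   .0811249   -11.49   0.000    -1.091526   -.7735219
---------------+----------------------------------------------------------------
       /athrho |  -.5750886   .3420724    -1.68   0.093    -1.245538     .095361
      /lnsigma |   .6978439   .0973047     7.17   0.000     .5071302    .8885576
---------------+----------------------------------------------------------------
           rho |  -.5190865   .2499007                     -.8470277     .095073
         sigma |   2.009416   .1955256                      1.660519     2.43162
        lambda |  -1.043061   .5904939                     -2.200407    .1142862
--------------------------------------------------------------------------------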

Wald test of indep. eqns. (rho = 0): chi2(1) =     2.83   Prob > chi2 = 0.0927
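
The term i.union#c.(black tenure) is factor-variable shorthand for two separate interactions. Written out in full, the command below should fit the same model:

. etregress wage age grade south i.union#c.black i.union#c.tenure, treat(union = south black tenure) vce(robust)

To judge whether the black and tenure coefficients actually differ by union status, one option is a Wald test on the stored coefficients. The _b[wage:...] names used here are an assumption read off the equation and row labels in the table above, not something the original example shows:

. * equality of the black coefficient across union regimes
. test _b[wage:0.union#c.black] = _b[wage:1.union#c.black]
. * equality of the tenure coefficient across union regimes
. test _b[wage:0.union#c.tenure] = _b[wage:1.union#c.tenure]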


The ATE of union membership is now estimated with margins. The “r.” notation tells margins to contrast the potential-outcome means for the treatment and control regimes.

. margins r.union, vce(unconditional) contrast(nowald)

Contrasts of predictive margins                          Number of obs = 1,210

Expression: Linear prediction, predict()

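--------------------------------------------------------------
             |            Unconditional
             |   Contrast   std. err.     [95% conf. interval]
-------------+------------------------------------------------
       union |
   (1 vs 0)  |   2.772613   .9382248       .933726      4.6115
--------------------------------------------------------------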



The estimated ATE, 2.77, is close to the 2.95 from the original model. Now we estimate the ATET of union membership with margins. We specify union in the subpop() option to restrict estimation to the treated subpopulation.

. margins r.union, vce(unconditional) contrast(nowald) subpop(union)

Contrasts of predictive margins                        Number of obs   = 1,210
                                                       Subpop. no. obs =   253

Expression: Linear prediction, predict()

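--------------------------------------------------------------
             |            Unconditional
             |   Contrast   std. err.     [95% conf. interval]
-------------+------------------------------------------------
       union |
   (1 vs 0)  |    2.70409   .9415886      .8586099    4.549569
--------------------------------------------------------------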



The estimated ATET (2.70) and ATE (2.77) are close, indicating that the average predicted outcome for the treatment group is similar to the average predicted outcome for the whole population.
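
Because the outcome equation is linear, both contrasts can also be rebuilt by hand. The treatment effect for an observation is the union coefficient plus the union-specific shifts in the black and tenure coefficients, multiplied by that observation's black and tenure values. The ATE averages this over the estimation sample, and the ATET averages it over union members only. The sketch below assumes the interacted model is still the active estimation result; the scalar names and the _b[wage:...] coefficient names are illustrative assumptions taken from the table labels.

. * differences in the black and tenure coefficients between regimes
. scalar dblack  = _b[wage:1.union#c.black]  - _b[wage:0.union#c.black]
. scalar dtenure = _b[wage:1.union#c.tenure] - _b[wage:0.union#c.tenure]

. * ATE: covariate means over the whole estimation sample
. quietly summarize black if e(sample)
. scalar mblack = r(mean)
. quietly summarize tenure if e(sample)
. scalar mtenure = r(mean)
. display _b[wage:1.union] + scalar(dblack)*scalar(mblack) + scalar(dtenure)*scalar(mtenure)

. * ATET: covariate means over union members only
. quietly summarize black if e(sample) & union == 1
. scalar mblack1 = r(mean)
. quietly summarize tenure if e(sample) & union == 1
. scalar mtenure1 = r(mean)
. display _b[wage:1.union] + scalar(dblack)*scalar(mblack1) + scalar(dtenure)*scalar(mtenure1)

The two displayed values should reproduce the margins point estimates up to rounding. They give no standard errors, which is why margins with vce(unconditional) remains the appropriate tool for inference. The calculation also shows that the ATE and ATET can differ only to the extent that the means of black and tenure differ between union members and the full sample.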