Linear regression with endogenous treatment effects

Stata’s etregress allows you to estimate an average treatment effect (ATE) and the other parameters of a linear regression model augmented with an endogenous binary-treatment variable. You just specify the treatment variable and the treatment covariates in the treat() option. The average treatment effect on the treated (ATET) can also be estimated with etregress.
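As a sketch of the model behind this description, the outcome and treatment equations can be written as below. The symbols are notation assumed here for illustration and do not appear in the text above: y is the outcome, t the binary treatment named in treat(), x the outcome covariates, and w the treatment covariates.

```latex
\begin{align*}
y_j &= \mathbf{x}_j\boldsymbol{\beta} + \delta t_j + \epsilon_j \\
t_j &= \begin{cases}
         1 & \text{if } \mathbf{w}_j\boldsymbol{\gamma} + u_j > 0 \\
         0 & \text{otherwise}
       \end{cases} \\
\begin{pmatrix} \epsilon_j \\ u_j \end{pmatrix}
    &\sim N\!\left( \mathbf{0},\
       \begin{pmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{pmatrix} \right)
\end{align*}
```

The treatment is endogenous when the two error terms are correlated, that is, when rho is nonzero. When the treatment is not interacted with any outcome covariate, delta is both the ATE and the ATET.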

We use etregress to estimate the ATE of union membership on women's wages. The outcome covariates are age, grade, and tenure, along with indicators for living in an SMSA (standard metropolitan statistical area) and for being African American. The treatment equation models union membership with an indicator for living in the southern region of the United States, the African American indicator, and tenure.

. webuse union3
(NLS Women 14-24 in 1968)

. etregress wage age grade smsa black tenure, treat(union = south black tenure)

Iteration 0:  Log likelihood =  -3140.811  
Iteration 1:  Log likelihood = -3053.6629  
Iteration 2:  Log likelihood = -3051.5847  
Iteration 3:  Log likelihood =  -3051.575  
Iteration 4:  Log likelihood =  -3051.575  

Linear regression with endogenous treatment             Number of obs =  1,210
Estimator: Maximum likelihood                           Wald chi2(6)  = 681.89
Log likelihood = -3051.575                              Prob > chi2   = 0.0000

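------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
wage         |
         age |   .1487409   .0193291     7.70   0.000     .1108566    .1866252
       grade |   .4205658   .0293577    14.33   0.000     .3630258    .4781058
        smsa |   .9117044   .1249041     7.30   0.000     .6668969    1.156512
       black |  -.7882471   .1367078    -5.77   0.000     -1.05619   -.5203048
      tenure |   .1524015   .0369596     4.12   0.000     .0799621    .2248409
     1.union |   2.945815   .2749621    10.71   0.000       2.4069    3.484731
       _cons |  -4.351572   .5283952    -8.24   0.000    -5.387208   -3.315936
-------------+----------------------------------------------------------------
union        |
       south |  -.5807419   .0851111    -6.82   0.000    -.7475566   -.4139271
       black |   .4557499   .0958042     4.76   0.000     .2679771    .6435226
      tenure |   .0871536   .0232483     3.75   0.000     .0415878    .1327195
       _cons |  -.8855758   .0724506   -12.22   0.000    -1.027576   -.7435753
-------------+----------------------------------------------------------------
     /athrho |  -.6544347   .0910314    -7.19   0.000     -.832853   -.4760164
    /lnsigma |   .7026769   .0293372    23.95   0.000      .645177    .7601767
-------------+----------------------------------------------------------------
         rho |  -.5746478    .060971                      -.682005   -.4430476
       sigma |   2.019151   .0592362                      1.906325    2.138654
      lambda |    -1.1603   .1495097                     -1.453334   -.8672668
------------------------------------------------------------------------------
LR test of indep. eqns. (rho = 0): chi2(1) = 19.84    Prob > chi2 = 0.0000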

The estimated ATE of being a union member is 2.95, the coefficient on 1.union. The ATET is the same as the ATE in this case because the treatment indicator variable has not been interacted with any of the outcome covariates.
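The rho, sigma, and lambda rows in the output are transformations of the estimated ancillary parameters /athrho and /lnsigma. The following is a minimal sketch of how to reproduce them with nlcom, assuming the etregress model above is still the active estimation result and that your Stata version accepts the _b[/athrho] syntax for ancillary parameters. The labels rho, sigma, and lambda inside nlcom are names chosen here for illustration.

```stata
* Assumes the first etregress model is the active estimation result.
* rho = tanh(athrho), sigma = exp(lnsigma), lambda = rho*sigma
nlcom (rho:    tanh(_b[/athrho]))                      ///
      (sigma:  exp(_b[/lnsigma]))                      ///
      (lambda: tanh(_b[/athrho]) * exp(_b[/lnsigma]))
```

The point estimates should match the rho, sigma, and lambda rows of the table. For example, exp(.7026769) is approximately 2.019, and the product of the reported rho and sigma, -.5746478 x 2.019151, is approximately -1.1603. The confidence intervals from nlcom are based on the delta method and may differ from those in the etregress output.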

When the treatment variable is interacted with outcome covariates, margins can use the parameter estimates from etregress to estimate the ATE. Here we use factor-variable notation to allow the black and tenure coefficients to vary with union membership. We specify vce(robust) because margins requires a robust VCE when we specify vce(unconditional) below.

. etregress wage age grade south i.union#c.(black tenure), treat(union = south black tenure) vce(robust)

Iteration 0:   Log pseudolikelihood = -3093.9289  
Iteration 1:   Log pseudolikelihood = -3069.8014  
Iteration 2:   Log pseudolikelihood = -3069.0214  
Iteration 3:   Log pseudolikelihood = -3069.0106  
Iteration 4:   Log pseudolikelihood = -3069.0106  

Linear regression with endogenous treatment             Number of obs =  1,210
Estimator: Maximum likelihood                           Wald chi2(8)  = 445.85
Log pseudolikelihood = -3069.0106                       Prob > chi2   = 0.0000

------------------------------------------------------------------------------
             |               Robust
             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
wage         |
         age |   .1547605    .020634     7.50   0.000     .1143186    .1952025
       grade |   .4328724   .0372888    11.61   0.000     .3597876    .5059572
       south |  -.5060951   .2009611    -2.52   0.012    -.8999716   -.1122186
             |
     union#c.|
       black |
          0  |  -.4695395   .2009365    -2.34   0.019    -.8633677   -.0757112
          1  |  -.8580219   .2893336    -2.97   0.003    -1.425105   -.2909385
             |
     union#c.|
      tenure |
          0  |   .1802719   .0545018     3.31   0.001     .0734504    .2870934
          1  |   .0848265   .0929442     0.91   0.361    -.0973408    .2669938
             |
       union |
          1  |   3.060777   .9504098     3.22   0.001     1.198008    4.923546
       _cons |  -3.847881   .6560055    -5.87   0.000    -5.133628   -2.562133
-------------+----------------------------------------------------------------
union        |
       south |  -.5041281   .0932344    -5.41   0.000    -.6868642    -.321392
       black |   .4506167   .0953425     4.73   0.000     .2637489    .6374845
      tenure |   .0917203   .0260037     3.53   0.000      .040754    .1426867
       _cons |  -.9325238   .0811249   -11.49   0.000    -1.091526   -.7735219
-------------+----------------------------------------------------------------
     /athrho |  -.5750886   .3420724    -1.68   0.093    -1.245538     .095361
    /lnsigma |   .6978439   .0973047     7.17   0.000     .5071302    .8885576
-------------+----------------------------------------------------------------
         rho |  -.5190865   .2499007                     -.8470277     .095073
       sigma |   2.009416   .1955256                      1.660519     2.43162
      lambda |  -1.043061   .5904939                     -2.200407    .1142862
------------------------------------------------------------------------------
Wald test of indep. eqns. (rho = 0): chi2(1) = 2.83    Prob > chi2 = 0.0927

The ATE of union membership is now estimated with margins. The “r.” notation tells margins to contrast the potential-outcome means for the treatment and control regimes.

. margins r.union, vce(unconditional) contrast(nowald)

Contrasts of predictive margins                          Number of obs = 1,210

Expression: Linear prediction, predict()

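--------------------------------------------------------------
             |            Unconditional
             |   Contrast   std. err.     [95% conf. interval]
-------------+------------------------------------------------
       union |
   (1 vs 0)  |   2.772613   .9382248       .933726      4.6115
--------------------------------------------------------------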
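To see where this contrast comes from, the following sketch writes the ATE in terms of the interaction-model coefficients. It assumes margins averages the difference between the linear predictions under union = 1 and union = 0 over the estimation sample. The symbols are notation chosen here for illustration, and the bars denote sample means of black and tenure.

```latex
\begin{align*}
\widehat{\mathrm{ATE}}
  &= \hat\delta
   + \bigl(\hat\beta_{1,\text{black}} - \hat\beta_{0,\text{black}}\bigr)\,\overline{\text{black}}
   + \bigl(\hat\beta_{1,\text{tenure}} - \hat\beta_{0,\text{tenure}}\bigr)\,\overline{\text{tenure}} \\
  &= 3.060777
   + (-.8580219 + .4695395)\,\overline{\text{black}}
   + (.0848265 - .1802719)\,\overline{\text{tenure}} \\
  &= 3.060777 - .3884824\,\overline{\text{black}} - .0954454\,\overline{\text{tenure}}
\end{align*}
```

The coefficient on 1.union alone (3.06) is the effect for an observation with black = 0 and tenure = 0. The two interaction terms pull the sample-average effect down to the reported 2.77.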

The estimated ATE, 2.77, is close to the estimate of 2.95 from the original model. Now we estimate the ATET of union membership with margins. We specify union in the subpop() option to restrict estimation to the treated subpopulation.

. margins r.union, vce(unconditional) contrast(nowald) subpop(union)

Contrasts of predictive margins                        Number of obs   = 1,210
                                                       Subpop. no. obs =   253

Expression: Linear prediction, predict()

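--------------------------------------------------------------
             |            Unconditional
             |   Contrast   std. err.     [95% conf. interval]
-------------+------------------------------------------------
       union |
   (1 vs 0)  |    2.70409   .9415886      .8586099    4.549569
--------------------------------------------------------------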

The estimated ATET (2.70) and ATE (2.77) are close, indicating that the average predicted effect of union membership among union members is similar to the average predicted effect in the whole population.
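As a rough cross-check on the margins point estimates, the contrasts can be computed by hand from the stored coefficients. This is a minimal sketch, assuming the interaction model fit with etregress is still the active estimation result (margins without the post option leaves it in place). The scalar names mb_all, mt_all, mb_trt, and mt_trt are hypothetical names introduced here.

```stata
* Sample means of the interacted covariates, whole estimation sample
quietly summarize black if e(sample)
scalar mb_all = r(mean)
quietly summarize tenure if e(sample)
scalar mt_all = r(mean)

* Sample means among the treated (union members) only
quietly summarize black if e(sample) & union == 1
scalar mb_trt = r(mean)
quietly summarize tenure if e(sample) & union == 1
scalar mt_trt = r(mean)

* ATE: difference in linear predictions averaged over everyone
display _b[wage:1.union]                                                 ///
    + (_b[wage:1.union#c.black]  - _b[wage:0.union#c.black])  * mb_all   ///
    + (_b[wage:1.union#c.tenure] - _b[wage:0.union#c.tenure]) * mt_all

* ATET: the same difference averaged over the treated only
display _b[wage:1.union]                                                 ///
    + (_b[wage:1.union#c.black]  - _b[wage:0.union#c.black])  * mb_trt   ///
    + (_b[wage:1.union#c.tenure] - _b[wage:0.union#c.tenure]) * mt_trt
```

This reproduces only the point estimates. The standard errors still need margins with vce(unconditional), which accounts for sampling variability in the covariates.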