
Linear regression with endogenous treatment effects


Stata’s etregress command allows you to estimate an average treatment effect (ATE) and the other parameters of a linear regression model augmented with an endogenous binary-treatment variable. You just specify the treatment variable and the treatment covariates in the treat() option. The average treatment effect on the treated (ATET) can also be estimated with etregress.
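For orientation, the model behind etregress is usually written as an outcome equation plus a probit-type treatment equation. The sketch below uses notation chosen here for illustration (y_j for the outcome, t_j for the binary treatment, x_j for the outcome covariates, w_j for the covariates listed in treat()). None of these symbols are printed by the command.

```latex
\begin{aligned}
y_j &= \mathbf{x}_j\boldsymbol{\beta} + \delta\, t_j + \epsilon_j \\
t_j &= \begin{cases} 1 & \text{if } \mathbf{w}_j\boldsymbol{\gamma} + u_j > 0 \\ 0 & \text{otherwise} \end{cases} \\
\begin{pmatrix} \epsilon_j \\ u_j \end{pmatrix} &\sim N\!\left( \mathbf{0},\;
\begin{pmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{pmatrix} \right)
\end{aligned}
```

The treatment is endogenous when the two error terms are correlated, that is, when rho is not zero. Without interactions between t_j and the outcome covariates, delta is the treatment effect.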

We estimate the ATE of being a union member on the wages of women with etregress. The outcome covariates are age, grade, and tenure, along with indicators for living in an SMSA (standard metropolitan statistical area) and for being African American. The treatment covariates are tenure and indicators for being African American and for living in the southern region of the United States.

. webuse union3
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. etregress wage age grade smsa black tenure, treat(union = south black tenure)

Iteration 0:   log likelihood = -3097.9871
Iteration 1:   log likelihood = -3052.5988
Iteration 2:   log likelihood = -3051.5789
Iteration 3:   log likelihood =  -3051.575
Iteration 4:   log likelihood =  -3051.575

Linear regression with endogenous treatment     Number of obs     =       1210
Estimator: maximum likelihood                   Wald chi2(6)      =     681.89
Log likelihood = -3051.575                      Prob > chi2       =     0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
wage         |
         age |   .1487409   .0193291     7.70   0.000     .1108566    .1866252
       grade |   .4205658   .0293577    14.33   0.000     .3630258    .4781058
        smsa |   .9117045   .1249041     7.30   0.000     .6668969    1.156512
       black |  -.7882471   .1367078    -5.77   0.000    -1.056189   -.5203047
      tenure |   .1524015   .0369596     4.12   0.000     .0799621    .2248409
       union |   2.945815   .2749624    10.71   0.000     2.406898    3.484731
       _cons |  -4.351572   .5283952    -8.24   0.000    -5.387208   -3.315936
-------------+----------------------------------------------------------------
union        |
       south |  -.5807419   .0851111    -6.82   0.000    -.7475567   -.4139271
       black |   .4557499   .0958042     4.76   0.000     .2679772    .6435226
      tenure |   .0871536   .0232483     3.75   0.000     .0415878    .1327195
       _cons |  -.8855759   .0724506   -12.22   0.000    -1.027576   -.7435754
-------------+----------------------------------------------------------------
     /athrho |  -.6544344   .0910315    -7.19   0.000    -.8328529   -.4760159
    /lnsigma |   .7026768   .0293372    23.95   0.000      .645177    .7601767
-------------+----------------------------------------------------------------
         rho |  -.5746476   .0609711                     -.6820049   -.4430472
       sigma |    2.01915   .0592362                      1.906324    2.138654
      lambda |    -1.1603   .1495099                     -1.453334    -.867266
------------------------------------------------------------------------------
LR test of indep. eqns. (rho = 0):   chi2(1) =    19.84   Prob > chi2 = 0.0000

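The estimated ATE of being a union member is 2.95, the coefficient on union in the wage equation. The ATET is the same as the ATE in this case because the treatment indicator variable has not been interacted with any of the outcome covariates, so the estimated effect is the same constant for every woman in the sample.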
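The rho, sigma, and lambda rows at the bottom of the output above are transformations of the two ancillary parameters that are actually estimated. As a quick check, and assuming the usual parameterization (athrho is the inverse hyperbolic tangent of rho, lnsigma is the log of sigma, and lambda is the product of rho and sigma), the reported values can be reproduced with display. The numbers are copied by hand from the /athrho and /lnsigma rows, and the inline comments assume the lines are run from a do-file.

```stata
* Point estimates copied from the /athrho and /lnsigma rows of the first model
display tanh(-.6544344)                    // compare with the rho row
display exp(.7026768)                      // compare with the sigma row
display tanh(-.6544344) * exp(.7026768)    // compare with the lambda row
```

A negative rho means the unobservables that raise the chance of union membership are associated with lower wages.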

When the treatment variable is interacted with outcome covariates, the ATE is no longer a single coefficient, but the parameter estimates from etregress can be used by margins to estimate it. Now we use factor-variable notation to allow the tenure and black coefficients to vary based on union membership. We specify vce(robust) when fitting the model because margins requires it when we specify vce(unconditional) below.

. etregress wage age grade south i.union#c.(black tenure), treat(union = south black tenure) vce(robust)

Iteration 0:   log pseudolikelihood = -3093.9289
Iteration 1:   log pseudolikelihood = -3069.8013
Iteration 2:   log pseudolikelihood = -3069.0214
Iteration 3:   log pseudolikelihood = -3069.0106
Iteration 4:   log pseudolikelihood = -3069.0106

Linear regression with endogenous treatment     Number of obs     =       1210
Estimator: maximum likelihood                   Wald chi2(8)      =     445.85
Log pseudolikelihood = -3069.0106               Prob > chi2       =     0.0000

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
wage         |
         age |   .1547605    .020634     7.50   0.000     .1143186    .1952025
       grade |   .4328724   .0372889    11.61   0.000     .3597876    .5059572
       south |  -.5060952   .2009613    -2.52   0.012    -.8999722   -.1122182
             |
       union#|
     c.black |
          0  |  -.4695394   .2009367    -2.34   0.019     -.863368   -.0757108
          1  |  -.8580218   .2893338    -2.97   0.003    -1.425106    -.290938
             |
       union#|
    c.tenure |
          0  |    .180272   .0545018     3.31   0.001     .0734504    .2870935
          1  |   .0848265   .0929442     0.91   0.361    -.0973408    .2669938
             |
       union |   3.060776   .9504121     3.22   0.001     1.198003     4.92355
       _cons |  -3.847881   .6560055    -5.87   0.000    -5.133628   -2.562133
-------------+----------------------------------------------------------------
union        |
       south |  -.5041281   .0932344    -5.41   0.000    -.6868642   -.3213921
       black |   .4506167   .0953425     4.73   0.000     .2637489    .6374845
      tenure |   .0917203   .0260037     3.53   0.000      .040754    .1426867
       _cons |  -.9325238    .081125   -11.49   0.000    -1.091526   -.7735218
-------------+----------------------------------------------------------------
     /athrho |  -.5750884   .3420731    -1.68   0.093    -1.245539    .0953626
    /lnsigma |   .6978439   .0973048     7.17   0.000     .5071299    .8885579
-------------+----------------------------------------------------------------
         rho |  -.5190864   .2499013                     -.8470281    .0950746
       sigma |   2.009415   .1955259                      1.660518     2.43162
      lambda |   -1.04306   .5904952                      -2.20041    .1142891
------------------------------------------------------------------------------
Wald test of indep. eqns. (rho = 0): chi2(1) =     5.17   Prob > chi2 = 0.0230
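The interaction term in the command above is factor-variable shorthand. As a sketch of what it expands to, the same model can be requested with each interaction written out separately. This assumes union3 is still in memory and that the lines are run from a do-file, where /// continues a command onto the next line.

```stata
* i.union#c.(black tenure) written out as two separate interactions
etregress wage age grade south i.union#c.black i.union#c.tenure, ///
    treat(union = south black tenure) vce(robust)
```

Each interaction produces one slope for nonmembers (level 0) and one for members (level 1), which is why black and tenure each appear twice in the wage equation. The union row in the output shows that etregress adds the treatment indicator itself to the outcome equation, even though union is not listed among the outcome covariates.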

The ATE of union membership is now estimated with margins. The “r.” notation tells margins to contrast the potential-outcome means for the treatment and control regimes.

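. margins r.union, vce(unconditional) contrast(nowald)

Contrasts of predictive margins

Expression   : Linear prediction, predict()

--------------------------------------------------------------
             |            Unconditional
             |   Contrast   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
       union |
   (1 vs 0)  |   2.772612   .9382272      .9337209    4.611504
--------------------------------------------------------------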
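To see where this contrast comes from, it helps to write out the difference in linear predictions between the two regimes. The sketch below assumes, as the Expression line indicates, that the contrast is taken on the linear prediction. The beta subscripts are notation introduced here for the union#c.black and union#c.tenure coefficients at levels 1 and 0, delta is the coefficient on union, and bars denote sample means.

```latex
\begin{aligned}
\widehat{\text{ATE}}
 &= \hat\delta
  + \left(\hat\beta_{\text{black},1} - \hat\beta_{\text{black},0}\right)\overline{\text{black}}
  + \left(\hat\beta_{\text{tenure},1} - \hat\beta_{\text{tenure},0}\right)\overline{\text{tenure}} \\
 &= 3.060776
  + \left(-.8580218 + .4695394\right)\overline{\text{black}}
  + \left(.0848265 - .180272\right)\overline{\text{tenure}} \\
 &= 3.060776 - .3884824\,\overline{\text{black}} - .0954455\,\overline{\text{tenure}}
\end{aligned}
```

The age, grade, and south terms cancel because their coefficients do not depend on union membership. Evaluated at the means of black and tenure in the estimation sample, this expression is the 2.77 reported by margins.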

The estimate of the ATE, 2.77, is close to the 2.95 from the original model. Now we estimate the ATET of union membership with margins. We specify union in the subpop() option to restrict estimation to the treated subpopulation.

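. margins r.union, vce(unconditional) contrast(nowald) subpop(union)

Contrasts of predictive margins

Expression   : Linear prediction, predict()

--------------------------------------------------------------
             |            Unconditional
             |   Contrast   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
       union |
   (1 vs 0)  |   2.704089   .9415909      .8586049    4.549573
--------------------------------------------------------------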

The estimated ATET (2.70) and ATE (2.77) are close, indicating that the average predicted effect of union membership for the treatment group is similar to the average predicted effect for the whole population.
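Because the treatment effect in this model varies only with black and tenure, the ATET can differ from the ATE only to the extent that union members have different average values of those two covariates. A rough way to see how different they are is to compare the means directly. This sketch assumes the etregress results are still the active estimates, so that e(sample) marks the estimation sample.

```stata
* Covariate means for the whole estimation sample
summarize black tenure if e(sample)

* The same means split by union status (0 = nonmember, 1 = member)
tabstat black tenure if e(sample), by(union) statistics(mean)
```

The closer the union == 1 row is to the overall means, the closer the ATET will be to the ATE.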
