# st: interpret treatment effect coefficient

 From Antonio Macias To statalist@hsphsun2.harvard.edu Subject st: interpret treatment effect coefficient Date Wed, 5 Mar 2008 13:38:41 -0800 (PST)

```
Hello,

What is the right interpretation after running a treatment effects regression?
(see my two specific questions Q1 and Q2 at the bottom of this email, after I explain what I am doing in Stata)

I run the analysis in four steps to explore the impact of "endogx" on "Y" and to assess the evidence of self-selection (that is, potential sample selection and potential endogeneity between endogx and Y). (I use the whole universe of firms reported in Compustat [a database of financial and accounting data on firms].)

1- First I run a simple OLS regression

regress Y  x1 x2 x3 x4 endogx z1 z2

I find a significant negative coefficient (-.027) for endogx (and insignificant coefficients for z1 and z2). Thus, so far I have found only a negative relation, not any evidence of causation.

2- To assess the existence of sample selection, I run a Heckman regression:

heckman Y  x1 x2 x3 x4, ///
select(endogx= x1 x2  z1 z2) twostep first

I find a significant positive lambda (0.083), which I take as evidence of sample selection. Yet what I want to test is the impact of the variable endogx on Y. Thus,

3- I run the treatment effect regression:

treatreg Y x1 x2 x3 x4, ///
treat(endogx = x1 x2 z1 z2) twostep first

and again find significant evidence of sample selection (significant positive lambda [19.82]). The coefficient for endogx in the main regression, with Y as the dependent variable, is significant and negative (-31.8).

4- I then run the following code to estimate the difference in Y due to the treatment "endogx":

predict Yst, yctrt
predict Ynst, ycntrt
generate diff=Yst-Ynst
summarize diff

and I find a significant negative difference (-6.6).

Then, my two specific questions are:

Q1-What should be the right interpretation for the third regression?

A- Is the difference due to the existing sample selection, so that I am just estimating the difference in predicted Y between the population that is more prone to experience the treatment endogx and the population that is less prone to it? (In other words, I would have estimated a relation that arises from comparing two different populations, as indicated by the sample-selection coefficient.)

or B- Is the estimated predicted difference an additional causal effect of endogx on Y, after controlling for sample selection and endogeneity?

C- Another interpretation? (Maybe there is evidence of sample selection, but still an additional causal effect of endogx on Y?)

Q2- What is the difference between the coefficient on endogx in the third regression (-31.8) and the estimated difference (-6.6)?

My main research question is to explain what other papers have found in a simple OLS regression (these papers report a significant negative coefficient of the variable endogx on Y). My coauthors and I hypothesize that the negative relation arises from sample-selection bias (and also from potential endogeneity bias in the decision whether or not to choose endogx). In other words, at the moment of choosing endogx, the firms already have lower Y values.

```
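As background for Q2, here is a hedged sketch of how the two quantities relate under the standard textbook treatment-effects model with a bivariate-normal error, which is the kind of model `treatreg` is documented to fit. The notation is an assumption made for illustration and does not come from the post: $y$ stands for Y, $z$ for endogx, $x$ for the outcome regressors, $w$ for the regressors in the treatment equation, $\delta$ for the coefficient on endogx, and $\lambda = \rho\sigma$ for the reported lambda.

```latex
\begin{aligned}
y_j &= x_j\beta + \delta z_j + \varepsilon_j,
  \qquad z_j = \mathbf{1}\!\left[\,w_j\gamma + u_j > 0\,\right], \\[4pt]
E[\,y_j \mid z_j = 1\,] &= x_j\beta + \delta
  + \rho\sigma\,\frac{\phi(w_j\gamma)}{\Phi(w_j\gamma)}, \\[4pt]
E[\,y_j \mid z_j = 0\,] &= x_j\beta
  - \rho\sigma\,\frac{\phi(w_j\gamma)}{1-\Phi(w_j\gamma)}, \\[4pt]
E[\,y_j \mid z_j = 1\,] - E[\,y_j \mid z_j = 0\,] &= \delta
  + \rho\sigma\,\frac{\phi(w_j\gamma)}{\Phi(w_j\gamma)\,\bigl[1-\Phi(w_j\gamma)\bigr]}.
\end{aligned}
```

If `yctrt` and `ycntrt` correspond to the second and third lines, then the variable `diff` is the last line evaluated for each observation. Under these assumptions it equals the coefficient $\delta$ only when $\lambda = \rho\sigma = 0$. Otherwise it mixes the coefficient with a selection term that depends on $w_j\gamma$, so it need not match the coefficient on endogx.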