Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


st: Instrumenting district level treatment in household level dif-in-dif regression


From   "Verpoorten, Marijke" <[email protected]>
To   "[email protected]" <[email protected]>
Subject   st: Instrumenting district level treatment in household level dif-in-dif regression
Date   Tue, 29 Mar 2011 10:33:05 +0000

It would be great if I could get some help with the following problem:
Suppose you want to analyze the impact of a district level treatment on household behavior. You have information from two repeated nationwide cross-sections (one pre-treatment and one post-treatment, so NO PANEL) and you have information on the treatment at the district level. Households are denoted by j, districts by i, and survey year by t. The outcome variable for household j in district i in year t is outcome_ijt.
The natural approach is a dif-in-dif, where the estimated coefficient on the interaction term (treatment_i*post_t, with post_t a dummy for the post-treatment survey round) is the coefficient of interest (b):
outcome_ijt = a + b*(treatment_i*post_t) + c*treatment_i + d*post_t + f*household_control_vars_ijt + e_ijt
Suppose now that the treatment is endogenous. Then, the approach would be to instrument for the treatment. This brings me to the following questions:

(1)    Which of the following options is (most) appropriate:

a.       Instrument both the interaction term (treatment_i*post-treatment survey round) and the component (treatment_i) using ivreg2 and two instruments

b.      Instrument only the interaction term using ivreg2 and one instrument (if a second instrument is not available or not appropriate)

c.       Run 2SLS by hand: first instrument for treatment_i, then plug the predicted value of treatment_i into both the main effect and the interaction term in the second stage.
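
To make option (a) concrete, here is a minimal simulated sketch in plain Python, not a recommendation among the three options. Everything in it is an assumption for illustration: a single district-level instrument z_i that is independent of an unobserved district trait u_i, a continuous treatment_i = z_i + u_i, the second instrument built as z_i*post_t, no household controls, and the hypothetical sample sizes and coefficients. It compares the OLS dif-in-dif coefficient with the just-identified IV estimate that instruments both treatment_i and the interaction term.

```python
import random

random.seed(0)

TRUE_B = 1.0        # assumed true coefficient on treatment_i*post_t
N_DISTRICTS = 500   # hypothetical number of districts
N_HOUSEHOLDS = 10   # hypothetical households per district and survey round


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def iv_estimate(Z, X, y):
    """Just-identified IV: solve (Z'X) beta = Z'y. With Z = X this is OLS."""
    k = len(X[0])
    ZtX = [[sum(z[i] * x[j] for z, x in zip(Z, X)) for j in range(k)]
           for i in range(k)]
    Zty = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(k)]
    return solve(ZtX, Zty)


# District-level draws: instrument z_i, unobserved trait u_i, treatment_i.
districts = []
for _ in range(N_DISTRICTS):
    z = random.gauss(0, 1)
    u = random.gauss(0, 1)
    districts.append((z, u, z + u))

# Two repeated cross-sections: new households are drawn in each round.
X, Z, y = [], [], []
for post in (0, 1):
    for z, u, treat in districts:
        for _ in range(N_HOUSEHOLDS):
            e = random.gauss(0, 1)
            # u_i*post_t makes the treatment endogenous in the dif-in-dif.
            outcome = (0.5 + TRUE_B * treat * post + 0.3 * treat
                       + 0.2 * post + 1.0 * u * post + 0.5 * u + e)
            X.append([1.0, treat * post, treat, post])
            Z.append([1.0, z * post, z, post])
            y.append(outcome)

b_ols = iv_estimate(X, X, y)[1]  # coefficient on treatment_i*post_t, OLS
b_iv = iv_estimate(Z, X, y)[1]   # same coefficient, both terms instrumented

print("OLS dif-in-dif b:", round(b_ols, 3))
print("IV  dif-in-dif b:", round(b_iv, 3))
```

The sketch only produces point estimates. Any inference on real data would additionally have to account for the treatment varying at the district level, for example through district-clustered standard errors.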



(2)

a.        Given that the treatment is at the district level, is it actually appropriate to use ivreg2, which then comes down to regressing a district level variable (treatment_i) on household level variables (household_control_vars_ijt)?

b.      Or should I use 2SLS and run the first-stage regression with only district-level variables on the RHS (including the district average of household_control_vars_ijt)?

c.       In that case, should I then use all observations or one observation per district?
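
One mechanical aspect of (2c) can be checked directly. The sketch below assumes a first stage with only district-level regressors (a single hypothetical instrument z_i and no household controls) and made-up district sizes. Under those assumptions, running the first stage on all household rows gives the same slope as running it on one row per district weighted by the number of households, while the unweighted one-row-per-district regression generally differs when district sizes are unequal.

```python
import random

random.seed(1)


def slope(x, y, w=None):
    """Weighted least-squares slope of y on x, with an intercept."""
    w = w or [1.0] * len(x)
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


# Hypothetical districts with unequal numbers of sampled households.
z_d, t_d, n_d = [], [], []
for _ in range(50):
    z = random.gauss(0, 1)
    z_d.append(z)                                # district-level instrument
    t_d.append(0.8 * z + random.gauss(0, 1))     # district-level treatment
    n_d.append(random.randint(5, 40))            # households in the district

# Household-level data: each household inherits its district's values.
z_hh = [z for z, n in zip(z_d, n_d) for _ in range(n)]
t_hh = [t for t, n in zip(t_d, n_d) for _ in range(n)]

slope_households = slope(z_hh, t_hh)          # all household observations
slope_weighted = slope(z_d, t_d, w=n_d)       # one row per district, weighted
slope_unweighted = slope(z_d, t_d)            # one row per district, unweighted

print(slope_households, slope_weighted, slope_unweighted)
```

This equivalence concerns point estimates only: the household-level rows repeat the same district values, so unclustered standard errors from the household-level regression would treat those repeats as independent observations.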
(3)
If the outcome is binary, there is the additional complication that the standard tests for over-identifying restrictions are not appropriate. Can I then use the usual manual Sargan test?

Thanks a lot in advance
Marijke
