Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Regression Discontinuity Design
From: Austin Nichols <firstname.lastname@example.org>
Date: Fri, 7 Oct 2011 13:42:48 -0400
Nyasha Tirivayi <email@example.com>
I said in my first reply: "There are IV methods one might use,
perhaps based on distance to clinic...." We would also need to know every
bit of information in your data, and what other data might be matched onto it,
to tell you what can be done. Perhaps the best approach is to recruit
a coauthor who can help you brainstorm another method.
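The quick power simulation suggested in the first reply (quoted below) might look like the following Stata sketch. Every number in it is an illustrative assumption rather than something taken from the data: a baseline of 30 weekly hours, a true effect of 3 hours, a community-level standard deviation of 4, an individual-level standard deviation of 15, and 125 adults in each of 8 communities (4 treated). The program name simpower is hypothetical.

```stata
* Sketch only: all parameter values below are assumptions, not estimates.
capture program drop simpower
program define simpower, rclass
    version 11
    args effect sd_comm sd_ind
    drop _all
    set obs 8                              // 8 communities
    gen community = _n
    gen T = community <= 4                 // first 4 communities treated
    gen u = rnormal(0, `sd_comm')          // shared community-level shock
    expand 125                             // 125 adults per community
    gen hours = 30 + `effect'*T + u + rnormal(0, `sd_ind')
    regress hours T, vce(cluster community)
    return scalar b = _b[T]
    * two-sided p-value using the cluster-based degrees of freedom
    return scalar p = 2*ttail(e(df_r), abs(_b[T]/_se[T]))
end

set seed 12345
simulate b=r(b) p=r(p), reps(500) nodots: simpower 3 4 15
gen reject = p < 0.05
summarize b reject
```

The mean of reject estimates power at the 5% level under these assumptions. Rerunning with the effect set to 0 (simpower 0 4 15) shows how often the test rejects when there is no effect, which is worth checking because cluster-robust inference is known to be unreliable with so few clusters.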
On Fri, Oct 7, 2011 at 11:28 AM, Nyasha Tirivayi <firstname.lastname@example.org> wrote:
> Hi Austin
> I mean baseline employment rates (obtained retrospectively) are higher
> for control individuals than for treated individuals. What we were
> told by the program staff was that community selection was done based
> on HIV rates of above 22%. But as you can see, one treated community
> is below 22% and one control community is above 22%.
> If I cannot use RDD, what other methods can I use instead of
> propensity score matching? My outcome is labour supply, measured as
> weekly hours, from cross-sectional data.
> May you kindly advise
> Nyasha Tirivayi
> Maastricht University
> On Fri, Oct 7, 2011 at 5:10 PM, Austin Nichols <email@example.com> wrote:
>> Nyasha Tirivayi <firstname.lastname@example.org> :
>> What do you mean, "baseline labour supply rates for the treated sample
>> (68%) are lower than from the control group (57%)"?
>> fwiw, I see no evidence of a discontinuity:
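>> clear
>> * T = treated indicator, N = adults sampled, HIVrate = community HIV rate (%)
>> input T Community N HIVrate
>> 1 1 103 22.5
>> 1 2 120 22.6
>> 1 3 122 22.5
>> 1 4 129 20.3
>> 0 5 124 18.5
>> 0 6 140 20.4
>> 0 7 126 18.5
>> 0 8 138 23.9
>> end
>> * plot treatment status against HIV rate, weighting markers by sample size
>> scatter T HIVrate [aw=N]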
>> On Fri, Oct 7, 2011 at 10:15 AM, Nyasha Tirivayi <email@example.com> wrote:
>>> Hi Austin
>>> Thank you so much for the response. I am trying to estimate the impact
>>> of a social program on intrahousehold labour supply. Hence I have
>>> labour supply data at individual level. In total I have 474
>>> individuals from 200 treated households (residing in 4 treated
>>> communities) and 532 individuals from 200 control households (residing
>>> in 4 control communities).
>>> I had initially done propensity score matching. However baseline
>>> labour supply rates for the treated sample (68%) are lower than from
>>> the control group (57%). One comment I have received is that the
>>> possibility of differential trends in labor market outcomes across
>>> program and non-program communities implies that any observed
>>> differences are not reliable measures of the effects of the food
>>> program. Hence journal reviewers are concerned about the possibility
>>> of unobservables and suggested a regression discontinuity approach (if
>>> possible) or within community estimates.
>>> Community  Households  Adult individuals  Community HIV rate (%)
>>> 1          50          103                22.5
>>> 2          50          120                22.6
>>> 3          50          122                22.5
>>> 4          50          129                20.3
>>> 1          50          124                18.5
>>> 2          50          140                20.4
>>> 3          50          126                18.5
>>> 4          50          138                23.9
>>> On Fri, Oct 7, 2011 at 1:55 PM, Austin Nichols <firstname.lastname@example.org> wrote:
>>>> Nyasha Tirivayi <email@example.com>
>>>> You do not have a good RD design, partly because you do not appear to
>>>> be confident of the existence of a discontinuity in treatment, but
>>>> mainly because you do not have adequate sample size. 6 communities
>>>> are hypothesized to lie on either side of the cutoff; if assumptions
>>>> are correct, communities close to the cutoff can be treated as being
>>>> randomly assigned treatment. People in those communities can also be
>>>> treated as being randomly assigned treatment under the stronger
>>>> assumption that community is fixed and people do not change community.
>>>> But you do not have 400 observations on the assignment variable with
>>>> which to construct a local linear regression of the effect of the
>>>> assignment variable on treatment; you have 6. The problem here is that
>>>> you will really want to cluster on community, but you cannot cluster
>>>> when you have 6 clusters (and when you cluster in the first stage, you
>>>> really only have 6 obs, not 400). Even 400 obs probably would not be
>>>> enough to identify any reasonably small effect using an RD method,
>>>> which needs a very large sample size to work well. The first thing to
>>>> do in such cases, if you are not sure how much power you might have,
>>>> is to run a quick simulation. There are IV methods one might use,
>>>> perhaps based on distance to clinic, but you are not really explicit
>>>> about what your estimand is. What are you trying to estimate? What
>>>> is the outcome variable?
>>>> On Thu, Oct 6, 2011 at 6:39 PM, Nyasha Tirivayi <firstname.lastname@example.org> wrote:
>>>>> I have questions about implementing a regression discontinuity
>>>>> approach. I have cross sectional data from 200 households on a social
>>>>> program and 200 control households. The program was targeted at two
>>>>> levels: geographically and at the household level.
>>>>> The geographic placement of the social program in communities appears
>>>>> to have been done based on HIV prevalence rates of more than 20.5% for
>>>>> 3 "treated" communities and less than 20.5% for 3 "control"
>>>>> communities. Two clinics do not follow this cutoff, making it a fuzzy
>>>>> discontinuity design at community level. After geographic placement,
>>>>> households were then selected based on a means tested score. However
>>>>> we do not have access to this data. We have data from 200 randomly
>>>>> sampled households who are actually in the social program and residing
>>>>> in the treated communities and from 200 control households with
>>>>> similar household characteristics to the treated households but
>>>>> residing in the control communities.
>>>>> My questions are as follows:
>>>>> 1. Would it be valid to use the community level discontinuity for
>>>>> impact evaluation? What software can I use in Stata?
>>>>> 2. If so would an RD approach based on 8 communities be valid? Is the
>>>>> sample of communities too small?
>>>>> 3. If RD is not appropriate, what other methods besides propensity score
>>>>> matching can I use, that can also take care of unobservables even with
>>>>> cross sectional data?
>>>>> Kindly advise
>>>>> Maastricht University