re:Re: st: Regression Discontinuity Design

From   "Ariel Linden, DrPH" <>
To   <>
Subject   re:Re: st: Regression Discontinuity Design
Date   Sun, 9 Oct 2011 12:23:55 -0400

First off, I apologize that my initial response was more-or-less the same as
Austin's. I get the digest, so my responses are always a day "after the

On the other hand, Austin and I both had identical concerns, so that should
give you some comfort (at least it does for me) :-)

A couple of additional points:

Austin is correct in suggesting that if your reviewers are concerned with
the impact of "unobservables", you may want to consider the instrumental
variable (IV) approach. However, the main problem with the IV approach (at
least in my opinion) is actually finding a good IV. Austin suggested
"distance to clinic", which may be a good IV if you have it. Under this
approach, you may have sufficient sample size if the unit of analysis is the
individual. You'd have to test other potential IVs accordingly...
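
To make the IV logic concrete, here is a minimal sketch on simulated data (in Python rather than Stata, purely for illustration). Everything in it is an assumption: the data-generating process, the coefficients, and the variable names are made up, and "distance" simply stands in for whatever instrument you might have. It compares a naive regression slope with the simple IV ratio cov(Z,Y)/cov(Z,D) when an unobservable drives both treatment and outcome.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def simulate(n=20000, true_effect=2.0, seed=12345):
    """Simulated individuals; every number here is hypothetical."""
    rng = random.Random(seed)
    z, d, y = [], [], []
    for _ in range(n):
        distance = rng.gauss(0, 1)  # instrument: standardized distance to clinic
        u = rng.gauss(0, 1)         # unobservable affecting treatment AND outcome
        # Treatment is more likely when distance is small and u is large.
        treated = 1 if (-0.8 * distance + u + rng.gauss(0, 1)) > 0 else 0
        # Distance affects the outcome only through treatment (exclusion restriction).
        outcome = true_effect * treated + 1.5 * u + rng.gauss(0, 1)
        z.append(distance)
        d.append(treated)
        y.append(outcome)
    return z, d, y


def naive_estimate(d, y):
    """Slope from regressing the outcome on treatment, ignoring u."""
    return covariance(d, y) / covariance(d, d)


def iv_estimate(z, d, y):
    """Simple IV estimator with one instrument: cov(Z, Y) / cov(Z, D)."""
    return covariance(z, y) / covariance(z, d)


z, d, y = simulate()
print("naive estimate:", round(naive_estimate(d, y), 2))
print("IV estimate:   ", round(iv_estimate(z, d, y), 2))
```

In this setup the naive estimate overstates the true effect of 2.0 because treated individuals also tend to have higher values of the unobservable, while the IV estimate should land close to 2.0. That only holds because the simulation builds in a valid instrument; with real data, the validity of the instrument is exactly the part you cannot take for granted.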

I initially suggested a multi-level model to allow for clustering at the
various levels. However, this will not adjust for unobservables, so it does
not address the reviewers' concerns.
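
As a back-of-the-envelope illustration of why clustering matters here, the sketch below applies the standard design-effect formula, 1 + (m - 1) * ICC, to 400 households in 8 communities. This is not a multi-level model. It assumes equal cluster sizes, treats the household as the unit, and uses intraclass correlation (ICC) values that are purely hypothetical.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_total, n_clusters, icc):
    """Number of independent observations the clustered sample is 'worth'."""
    cluster_size = n_total / n_clusters
    return n_total / design_effect(cluster_size, icc)


# 400 households in 8 communities; the ICC values are hypothetical.
for icc in (0.01, 0.05, 0.20):
    n_eff = effective_sample_size(400, 8, icc)
    print(f"ICC = {icc:.2f} -> effective sample size = {n_eff:.1f}")
```

Even a modest ICC shrinks the effective sample well below 400, which is one way to see why the number of communities, and not just the number of households, limits what the data can support.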

Propensity score matching (or weighting) would only ensure balance on
observables, so the confounding trend in income, etc. that the reviewers
are concerned about remains a viable threat to validity.

Perhaps a viable alternative would be to model the data longitudinally,
either as a time series at the community level or using individual-level
data. I am not clear what your data look like, but if you have multiple
time points, you could perhaps account for the differing "trends" you
spoke of.
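
To sketch what accounting for differing trends could look like, here is a toy example (again in Python, with entirely made-up community-level numbers). It assumes you have several pre-program time points, fits a separate linear trend to each group, and compares each group's post-program value to its own projected trend. A simple before/after difference-in-differences is shown alongside for contrast.

```python
def fit_line(times, values):
    """Ordinary least squares fit of values = intercept + slope * time."""
    n = len(times)
    t_bar = sum(times) / n
    v_bar = sum(values) / n
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    sxx = sum((t - t_bar) ** 2 for t in times)
    slope = sxy / sxx
    return v_bar - slope * t_bar, slope


def trend_adjusted_effect(pre_times, treated_pre, control_pre,
                          post_time, treated_post, control_post):
    """Difference between groups in their deviation from their own pre-trend."""
    def deviation(pre_values, post_value):
        intercept, slope = fit_line(pre_times, pre_values)
        return post_value - (intercept + slope * post_time)
    return deviation(treated_pre, treated_post) - deviation(control_pre, control_post)


# Hypothetical community-level outcome means at four pre-program time points.
pre_times = [0, 1, 2, 3]
treated_pre = [10.0, 12.0, 14.0, 16.0]  # rising by 2 per period
control_pre = [10.0, 11.0, 12.0, 13.0]  # rising by 1 per period
post_time = 4
treated_post = 21.0  # the trend alone would give 18, so the built-in effect is 3
control_post = 14.0  # exactly on the control trend

# Simple difference-in-differences using only the last pre-period value.
simple_did = (treated_post - treated_pre[-1]) - (control_post - control_pre[-1])
adjusted = trend_adjusted_effect(pre_times, treated_pre, control_pre,
                                 post_time, treated_post, control_post)
print("simple DiD:    ", simple_did)
print("trend-adjusted:", adjusted)
```

Because the treated group was already rising faster, the simple comparison attributes part of that pre-existing trend to the program, whereas the trend-adjusted version recovers the effect built into the toy data. None of this is possible with a single cross-section, which is why it matters what your data actually look like.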

Good luck!


Date: Fri, 7 Oct 2011 16:57:54 +0200
From: Nyasha Tirivayi <>
Subject: Re: st: Regression Discontinuity Design

Dear Ariel

Quick answers to your questions:
a) X variable for assignment is HIV prevalence rate at community level
b) cut off is 22%
c) yes I am also worried n=8 at community level is not sufficient

My outcome is labour supply for individuals in 200 treated households
(residing in 4 chosen communities) and 200 control households
(residing in 4 control communities). However, does this still count as
randomization at community level if program placement was non-random,
i.e. they specifically targeted communities with higher HIV rates
(above 22%)? Household recruitment was not randomized either.

In that case can I use multilevel modelling? I had done propensity
score matching, but reviewers feel there are unobservables I am
overlooking. So with cross sectional data, what other methods can I
plausibly use?

Kindly advise


Nyasha Tirivayi
Maastricht University

On Fri, Oct 7, 2011 at 4:33 PM, Ariel Linden, DrPH
<> wrote:
> Hi Nyasha,
> It seems like you've got several different things going on here at once.
> RD design can be thought of as an observational study equivalent of an RCT
> (where the cutoff represents the randomization). If we think about it in
> those simple terms, then I'd ask you this: (a) what is the X variable that
> you'd be using for assignment, (b) what would be the cutoff, and (c) do you
> think that an N=8 is reasonable?
> It is not clear from your description what either (a) or (b) is, but I can
> certainly say without any hesitation that N=8 is not sufficient.
> An excellent recent article for you to read on the RD design is: Lee, D.S.,
> Lemieux, T. (2010) Regression discontinuity designs in economics. Journal
> of Economic Literature 48, 281-355.
> Without getting into too deep of a methodological discussion, it seems to me
> that if you already have randomization at the community level, you should
> consider hierarchical/multi-level modeling to tease out whatever effect you
> are looking for.
> Ariel
> Date: Fri, 7 Oct 2011 00:39:17 +0200
> From: Nyasha Tirivayi <>
> Subject: st: Regression Discontinuity Design
> Hello
> I have questions about implementing a regression discontinuity
> approach. I have cross sectional data from 200 households on a social
> program and 200 control households. The program was targeted at two
> levels- geographically and at household level.
> The geographic placement of the social program in communities appears
> to have been done based on HIV prevalence rates of more than 20.5% for
> 3 "treated" communities and less than 20.5% for 3 "control
> communities". Two clinics do not follow this cutoff making it a fuzzy
> discontinuity design at community level. After geographic placement,
> households were then selected based on a means tested score. However
> we do not have access to this data. We have data from 200 randomly
> sampled households who are actually in the social program and residing
> in the treated communities and from 200 control households with
> similar household characteristics to the treated households but
> residing in the control communities.
> My questions are as follows:
> 1. Would it be valid to use the community level discontinuity for
> impact evaluation? What software can I use in Stata?
> 2. If so would an RD approach based on 8 communities be valid? Is the
> sample of communities too small?
> 3. If RD is not appropriate, what other methods besides propensity score
> matching can I use, that can also take care of unobservables even with
> cross sectional data?
> Kindly advise
> Regards
> N.Tirivayi
> Maastricht University
> Netherlands
