Re: st: Regression Discontinuity Designs with rd package and binary outcome

From	Jeffrey Wooldridge <[email protected]>
To	[email protected]
Subject	Re: st: Regression Discontinuity Designs with rd package and binary outcome
Date	Wed, 16 Jan 2013 06:47:13 -0500

I agree it is not easy to get a standard error. I wanted to make sure
the poster understood there is no conceptual problem, and no issue with
consistency, in using nonlinear models. It really is a standard error
calculation issue.

I haven't done simulation work. I probably would fix the bandwidth at
the same value used with the original data. That might be the most
realistic way of obtaining a standard error for the estimate that
uses that particular bandwidth.

Jeff
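
The fixed-bandwidth bootstrap discussed in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions, not part of -rd- and not anyone's published method: it is written in Python (standard library only) rather than Stata so that it is self-contained, the names z, d, y, and h are hypothetical, the cutoff is assumed to be at z = 0, the kernel is assumed triangular, and the data are simulated. It fits a kernel-weighted logit on each side of the cutoff for both the outcome and the treatment, forms the ratio of the two estimated jumps, and resamples observations while holding h fixed.

```python
import math
import random


def expit(t):
    """Logistic function, written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def local_logit_at_zero(z, v, h, max_iter=50):
    """Triangular-kernel-weighted logit of v on (1, z); returns fitted P(v=1 | z=0)."""
    obs = [(zi, vi, 1.0 - abs(zi) / h) for zi, vi in zip(z, v) if abs(zi) < h]
    a = b = 0.0
    for _ in range(max_iter):  # Newton-Raphson on the weighted log likelihood
        g0 = g1 = h00 = h01 = h11 = 0.0
        for zi, vi, wi in obs:
            p = expit(a + b * zi)
            r = wi * (vi - p)        # weighted score contribution
            s = wi * p * (1.0 - p)   # weighted information contribution
            g0 += r
            g1 += r * zi
            h00 += s
            h01 += s * zi
            h11 += s * zi * zi
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:             # no usable variation; stop where we are
            break
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < 1e-10:
            break
    return expit(a)


def fuzzy_rd_ratio(z, d, y, h):
    """Jump in P(y=1 | z) at the cutoff divided by the jump in P(d=1 | z)."""
    left = [i for i, zi in enumerate(z) if zi < 0]
    right = [i for i, zi in enumerate(z) if zi >= 0]

    def jump(v):
        above = local_logit_at_zero([z[i] for i in right], [v[i] for i in right], h)
        below = local_logit_at_zero([z[i] for i in left], [v[i] for i in left], h)
        return above - below

    return jump(y) / jump(d)


def bootstrap_se(z, d, y, h, reps=100, seed=12345):
    """Resample observations with replacement, keeping the bandwidth h fixed."""
    rng = random.Random(seed)
    n = len(z)
    draws = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(fuzzy_rd_ratio([z[i] for i in idx], [d[i] for i in idx],
                                    [y[i] for i in idx], h))
    mean = sum(draws) / reps
    return math.sqrt(sum((t - mean) ** 2 for t in draws) / (reps - 1))


def simulate(n, seed):
    """Hypothetical data: treatment probability jumps at z = 0; y depends on d and z."""
    rng = random.Random(seed)
    z, d, y = [], [], []
    for _ in range(n):
        zi = rng.uniform(-1.0, 1.0)
        p_d = expit((1.0 if zi >= 0 else -1.0) + 0.5 * zi)
        di = 1 if rng.random() < p_d else 0
        yi = 1 if rng.random() < expit(-0.5 + di + 0.3 * zi) else 0
        z.append(zi)
        d.append(di)
        y.append(yi)
    return z, d, y


if __name__ == "__main__":
    z, d, y = simulate(2000, seed=1)
    h = 0.5  # bandwidth chosen once on the original data, then held fixed
    print("ratio estimate:", fuzzy_rd_ratio(z, d, y, h))
    print("bootstrap SE  :", bootstrap_se(z, d, y, h))
```

In this simulated design the ratio at the cutoff is about 0.245 by construction (expit(0.5) - expit(-0.5)), so the estimate can be compared against that value. Whether holding h fixed across replications performs well is exactly the open question raised in the thread; the sketch only shows the mechanics.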


On Tue, Jan 15, 2013 at 11:15 AM, Austin Nichols
<[email protected]> wrote:
> Jeff--
> I don't think we disagree.  The RD design allows many forms of a local
> Wald-type estimator, but the design of -rd- does not.  -rd- is a
> convenience command on SSC that estimates linear models only, using
> -ivreg- and/or -suest- and/or -lpoly- and/or other Stata machinery.
>
> One should not underestimate the difficulty of obtaining standard
> errors, nor of picking an appropriate bandwidth for an arbitrary
> combination of conditional expectation functions, and it is not
> immediately clear to me that one would want to impose a fixed
> bandwidth (from some ROT calculation) and then resample obs to get a
> bootstrap estimate for the SE of a ratio of two nonlinear regression
> coefs.  Have you simulated performance of alternatives for these
> cases?
>
> On Tue, Jan 15, 2013 at 10:50 AM, Jeffrey Wooldridge
> <[email protected]> wrote:
>> I'm going to have to differ with Austin on this one. The IV
>> characterization of the fuzzy RD design is useful but it is not how
>> one should evaluate whether the method can be extended to other kinds
>> of response models. In 2e of my MIT Press book I provide a general
>> treatment where the response functions can be virtually anything. In
>> particular, my equation (21.107) -- the ratio of estimated jumps in
>> the regression functions to the response probability -- allows for any
>> kind of mean estimation. One could easily do local logistic regression
>> for both the treatment and response, and even polynomial versions. The
>> difficulty is in obtaining standard errors, but the bootstrap can be
>> used if one does not want to do the delta method.
>>
>> If y has any features one would like to account for -- for example, it
>> could be a count variable -- then this can be done in an RD framework.
>> For example, I would use local Poisson regression coupled with local
>> logit estimation for the treatment.
>>
>> JW
>>
>> On Mon, Jan 14, 2013 at 10:36 AM, Austin Nichols
>> <[email protected]> wrote:
>>> Philippe Van Kerm asked me that same question in 2008, and in the
>>> absence of any new insight, my answer is the same. The -rd- design can
>>> be thought of (and estimated) as a local IV model (-ivreg- with
>>> weights emphasizing obs close to the cutoff), where a binary treatment
>>> is instrumented by a dummy D for "Z>0 (assignment var above the
>>> cutoff)" while controlling for Z and DZ. It might make sense to
>>> estimate local logits in many cases, for the first stage since
>>> treatment is binary, or the second stage when the outcome is binary.
>>> But logits and IV do not mix well. You can write out a GMM form of
>>> local probits or logits or estimate a reweighted bivariate probit, but
>>> while the linear model works well in most cases even when variables
>>> are binary, the other models require functional form assumptions and
>>> may often introduce bias where the local linear model had negligible
>>> bias.
>>>
>>> There are cases that need special treatment, where the linear model
>>> does not work well, but then you have to switch to another model, and
>>> give up on -rd- which is designed around linear models only.
>>> Currently, there is only one fix for a failure of the linear model in
>>> -rd-, when predictions for mean treatment at the cutoff lie outside
>>> the feasible range (where you might want another link function) but
>>> the fix is just to switch to local mean smoothing (a zero degree
>>> polynomial), not to a logit or another model.
>>>
>>> On Sun, Jan 13, 2013 at 6:26 PM,  <[email protected]> wrote:
>>>> Hi,
>>>>
>>>> I am using a regression discontinuity design (RDD) and Austin Nichols's rd command to estimate the effect of a treatment D, which is defined by a cutoff point c in some continuous variable x, on some outcome y. My outcome variable y is binary. This is a pretty common situation for RDD. E.g. the classic incumbency advantage example often uses "winning in the next election" as one of the outcome variables (e.g. Lee and Lemieux 2010).
>>>>
>>>> Here is my question: I have never read anything about linear vs. logistic regression in the RDD literature (or about the distribution of the outcome variable in general). Local linear (or polynomial) regressions are commonly used, for example via `lpoly` in the rd package. Why not a local logistic regression?
>>>>
>>>> Thanks!
>>>>
>>>>
>>>>
>>>> Lee, David S., and Thomas Lemieux. 2010. “Regression Discontinuity Designs in Economics.” Journal of Economic Literature 48:281–355.
>>>>
>>>> rd package
>>>> http://ideas.repec.org/c/boc/bocode/s456888.html
>>>
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>> *   http://www.ats.ucla.edu/stat/stata/
>>
>

