
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: RD Optimal Bandwidth Algorithm sensitive to scaling?


From   Austin Nichols <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: st: RD Optimal Bandwidth Algorithm sensitive to scaling?
Date   Fri, 15 Jul 2011 15:27:32 -0400

Mark-
I'm at the Stata conference in Chicago and not back at work until Tuesday,
but I suspect this is related to the number of observations: the fifth root
of N plays a big role in the bandwidth calculation, and auto.dta has only
74 obs.
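
A rough sketch of how one might check this, assuming the installed version of -rd- accepts the bwidth() option to set the bandwidth by hand. The value 25 is arbitrary and chosen only for illustration, not as a recommended bandwidth:

```stata
* Fifth roots of the two sample sizes in this thread
* (auto.dta has N = 74, votex has N = 349)
display 74^(1/5)
display 349^(1/5)

* Hold the bandwidth fixed so it is not re-estimated from the outcome.
* ASSUMPTION: bwidth() is available in the installed -rd-;
* 25 is an illustrative value only.
sysuse auto, clear
gen double Price = price/1000
gen double z = length - 193
rd price z, bwidth(25) mbw(100)
rd Price z, bwidth(25) mbw(100)
```

With the bandwidth held fixed, the two lwald estimates should differ only by the factor of 1000, since the local linear fit is linear in the outcome. Any remaining difference between runs would then point to the bandwidth selection step rather than to the estimator itself.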

On Friday, July 15, 2011, Schaffer, Mark E <[email protected]> wrote:
> There is something peculiar going on here...
>
> When I try to replicate Chris' example but using the sample votex
> dataset Austin provides with -rd-, I get no sensitivity to scaling.  But
> when I do it using the auto dataset as Chris does, I get the same
> sensitivity to scaling that he does.  In fact, if price is rescaled by a
> factor of 1,000,000 instead of Chris' 1,000, -rd- exits with an
> "insufficient observations" error!  Very curious....
>
> --Mark
>
> **************************************
>
> votex example:
>
> use votex, clear
> gen double LNE=lne/1000
> sum lne LNE d
> rd lne d, mbw(100)
> rd LNE d, mbw(100)
>
> Output:
>
>
> . sum lne LNE d
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>          lne |       349    21.32478    .4329206   19.65047    23.1144
>          LNE |       349    .0213248    .0004329   .0196505   .0231144
>            d |       349    .0502933    .1604194  -.2756163   .4696784
>
> . rd lne d, mbw(100)
> Two variables specified; treatment is
> assumed to jump from zero to one at Z=0.
>
>  Assignment variable Z is d
>  Treatment variable X_T unspecified
>  Outcome variable y is lne
>
> Estimating for bandwidth .29287775925349
> ------------------------------------------------------------------------------
>          lne |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>        lwald |  -.0773955   .1056062    -0.73   0.464      -.28438    .1295889
> ------------------------------------------------------------------------------
>
> . rd LNE d, mbw(100)
> Two variables specified; treatment is
> assumed to jump from zero to one at Z=0.
>
>  Assignment variable Z is d
>  Treatment variable X_T unspecified
>  Outcome variable y is LNE
>
> Estimating for bandwidth .2928777592534422
> ------------------------------------------------------------------------------
>          LNE |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>        lwald |  -.0000774   .0001056    -0.73   0.464    -.0002844    .0001296
>
>
> **************************************
>
> auto example:
>
> Code:
>
> sysuse auto, clear
> gen double Price = price/1000
> gen double PRICE = price/1000000
> gen double z = length - 193
> sum price Price PRICE z
> rd price z, mbw(100)
> rd Price z, mbw(100)
> rd PRICE z, mbw(100)
>
> Output:
>
> . sum price Price PRICE z
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>        price |        74    6165.257    2949.496       3291      15906
>        Price |        74    6.165257    2.949496      3.291     15.906
>        PRICE |        74    .0061653    .0029495    .003291    .015906
>            z |        74   -5.067568    22.26634        -51         40
>
> . rd price z, mbw(100)
> Two variables specified; treatment is
> assumed to jump from zero to one at Z=0.
>
>  Assignment variable Z is z
>  Treatment variable X_T unspecified
>  Outcome variable y is price
>
> Estimating for bandwidth 24.98807626042474
> ------------------------------------------------------------------------------
>        price |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>        lwald |   -5198.13   2230.786    -2.33   0.020    -9570.391   -825.8697
> ------------------------------------------------------------------------------
>
> . rd Price z, mbw(100)
> Two variables specified; treatment is
> assumed to jump from zero to one at Z=0.
>
>  Assignment variable Z is z
>  Treatment variable X_T unspecified
>  Outcome variable y is Price
>
> Estimating for bandwidth 8.731619909031293
> ------------------------------------------------------------------------------
>        Price |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>        lwald |  -7.547781   2.493275    -3.03   0.002    -12.43451   -2.661051
> ------------------------------------------------------------------------------
>
> . rd PRICE z, mbw(100)
> Two variables specified; treatment is
> assumed to jump from zero to one at Z=0.
>
>  Assignment variable Z is z
>  Treatment variable X_T unspecified
>  Outcome variable y is PRICE
>
> insufficient observations
> r(2001);
>
> **************************************
>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of
>> Austin Nichols
>> Sent: 15 July 2011 15:23
>> To: [email protected]
>> Subject: Re: st: RD Optimal Bandwidth Algorithm sensitive to scaling?
>>
>> Chris--
>> I agree it is an undesirable "feature" of the optimal
>> bandwidth calculation, but some problem of this sort is
>> probably unavoidable--in this case it arises from estimating
>> local curvature using squared deviations of the outcome,
>> which is evidently sensitive to scale.
>> There are alternative approaches which would not face this
>> exact problem, but there would almost surely be other
>> problems, or other ways of breaking the estimator.  The
>> sensitivity of bandwidth to scale is particularly
>> undesirable, but also serves to illustrate what I have said
>> elsewhere: bandwidth selection is more art than science, and
>> at a minimum you should assess the sensitivity of your
>> estimates to bandwidth, which is why graphs for multiple
>> bandwidths are produced by default in -rd-, and there is an
>> option -bdep- to assess the dependence graphically.
>>
>> On Fri, Jul 15, 2011 at 9:45 AM, Stata Chris
>> <[email protected]> wrote:
>> > Dear list members,
>> >
>> > I am using Austin Nichols' -rd-
>> > (http://ideas.repec.org/c/boc/bocode/s456888.html) command, as well as
>> > the related -rdob- by Fuji-Imbens-Kalyanaraman
>> > (http://www.economics.harvard.edu/faculty/imbens/software_imbens).
>> >
>> > Now I've discovered that the optimal bandwidth chosen and hence the
>> > resulting estimates are sensitive to the scaling of the outcome variable.
>> > To demonstrate this, I make use of an example discussed in this
>> > context in an earlier post:
>> >
>> > sysuse auto, clear
>> > gen Price = price/1000
>> > gen z = length - 193
>> > rd price z
>> > rd Price z
>> >
>> >
>> > As you can check, when I use as outcome the price in thousands of
>> > dollars ("Price") rather than in dollars ("price"), I get a different
>> > bandwidth and hence a very different estimate, whereas I would
>> > expect to get the previous estimate just divided by 1000.
>> >
>> > This does not seem a very desirable property to me, but I'm not sure
>> > where in the optimal bandwidth algorithm (see
>> > http://www.nber.org/papers/w14726 ) this comes from and whether it
>> > would be possible to avoid this. Perhaps some of you can say more
>> > about this?
>> >
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>>
>
>
> --
> Heriot-Watt University is a Scottish charity
> registered under charity number SC000278.
>


