Re: st: Obtaining correct inference from within strata -att- estimation with -psmatch2-


From   Austin Nichols <austinnichols@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Obtaining correct inference from within strata -att- estimation with -psmatch2-
Date   Fri, 24 Feb 2012 15:23:28 -0500

Richard Herron <richard.c.herron@gmail.com> :
I think the advice in the -psmatch2- help file on matching within
strata and then computing the ATT is bad advice.
[ http://fmwww.bc.edu/repec/bocode/p/psmatch2.html accessed today ]

Instead, compute your own propensity scores within each stratum, then
find matches within stratum for the whole sample by constructing a new
score equal to stratum id times 100 plus the propensity score, and let
-psmatch2- tell you the ATT for the whole sample after matching on
that new score.  (Propensity scores lie between 0 and 1, so scores in
different strata are at least 99 apart, and a nearest-neighbor match
stays within the stratum whenever that stratum contains a control.)
Or better, use the propensity scores to reweight, or use a doubly
robust model.  Either way, I think you should worry about clustered
errors.  I hope you have more than 12 strata though...

webuse nlswork, clear
egen g = group(ind_code)
levelsof g, local(gr)
loc x collgrad union tenure age race
loc t msp
loc y ln_wage
qui reg `y' `t' `x', cl(g)
keep if e(sample)
g ps=.
qui foreach j of local gr {
 logit `t' `x' if (g ==`j')
 predict ps`j' if e(sample)
 replace ps=ps`j' if e(sample)
 }
* store as double so the propensity score keeps its precision after adding g*100
g double r=ps+g*100
psmatch2 `t', out(`y') pscore(r)
* mean comparison using matches
reg `y' `t'  [pw=_weight], nohe
* mean comparison with cluster-robust SEs (CRSE)
reg `y' `t'  [pw=_weight], cl(g) nohe
egen m=mean(`t'), by(g)
g w=cond(`t',m/(1-m),ps/(1-ps))
ta `t'
ta `t' [aw=w]
ta g `t' , row nofr
ta g `t' [aw=w], row nofr
* mean comparison using PS weights
reg `y' `t'  [pw=w], nohe
* mean comparison with CRSE
reg `y' `t'  [pw=w], cl(g) nohe
* doubly robust
reg `y' `t' `x' [pw=w], nohe
* doubly robust with CRSE
reg `y' `t' `x' [pw=w], cl(g) nohe
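
As a quick check on the matching step above, one could confirm that no treated observation was matched to a control in a different stratum. This is only a sketch: it assumes -psmatch2- was run with its default single nearest neighbor, so that the variables _id, _n1 and _treated that it leaves behind are still in memory, and g_of_match is a name made up for illustration.

```stata
* sketch: check that each treated observation's match is in the same stratum
* assumes -psmatch2- (default one nearest neighbor) left _id, _n1, _treated in memory
* g_of_match is a hypothetical variable name used only for this check
sort _id
gen g_of_match = g[_n1] if _treated == 1 & !missing(_n1)
* a count of zero means every match stayed within its stratum
count if _treated == 1 & !missing(_n1) & g_of_match != g
```

If the count is not zero, some stratum presumably had no usable control, and those matches crossed strata.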


On Fri, Feb 24, 2012 at 12:48 PM, Richard Herron
<richard.c.herron@gmail.com> wrote:
> I am using -psmatch2- (SSC) to match on propensity score and would
> like to limit matches to the same stratum. The example code in
> -psmatch2-'s help file iterates over strata with a -foreach- loop and
> -if- statements and returns -att- (average treatment effect on the
> treated) for each stratum, which the code stores for every observation
> in that stratum (I could also limit this to actually matched treatment
> and control). Then the example code finds the mean -att- with -summarize-.
> I would like to make inferences on the mean -att-, however there are
> many observations, so standard errors are small and t-statistics
> large. If I use one -att- observation per stratum (similar to a
> Fama-MacBeth regression), then I get completely different inferences.
>
> Is there a better way to limit matches within strata? I have tried the
> -mahalanobis- option using the whole data set, but the data are very
> large and it takes quite a long time (more than overnight to perform
> matching on distance in the whole data set).
>
> Or is the correct answer to avoid these inference problems to _not_
> aggregate my -att- over strata and evaluate them stratum by stratum?
> Thanks! I provide an example below.
>
> In my case the strata are fiscal years, but in the readily available
> data the strata are industry codes.
>
> * begin code
>
> * ssc install psmatch2
> webuse nlswork, clear
> generate att = .
> egen g = group(ind_code)
> levelsof g, local(gr)
> quietly foreach j of local gr {
>        psmatch2 msp collgrad union tenure age race ///
>            if (g == `j'), out(ln_wage)
>        replace att = r(att) if  (g == `j')
> }
>
> * but -psmatch2- assigns same -att- to every matched observation in each
> * group -g- (here grouped on -ind_code-)
> table g att
>
> * so the standard errors are very small and the t-stats too large
> summarize att
> * t-statistic: mean divided by its standard error, sd/sqrt(N)
> scalar t = `r(mean)' / (`r(sd)' / sqrt(`r(N)'))
> scalar list t
>
> * is there a more sensible way to get the correct inference, while
> * still forcing matches only within a given stratum?
> * one thought is to collapse on -g-
> collapse (mean) att, by(g)
> summarize att
> * t-statistic: mean divided by its standard error, sd/sqrt(N)
> scalar t = `r(mean)' / (`r(sd)' / sqrt(`r(N)'))
> scalar list t
>
> * here the inference flips from negative significant to positive
> * significant, but I haven't done any weighting by group
>
> * end code
