Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Richard Herron <richard.c.herron@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Obtaining correct inference from within strata -att- estimation with -psmatch2-
Date: Fri, 24 Feb 2012 18:49:17 -0500

Thanks for the lesson, Austin! I will have to dig back into the books
to get a better grasp of your odds ratio weighting in the
-w = cond()- statement. And thanks for the stratum/strata lesson, Nick.

On Fri, Feb 24, 2012 at 15:23, Austin Nichols <austinnichols@gmail.com> wrote:
> Richard Herron <richard.c.herron@gmail.com> :
> I think the advice given on matching within strata and then computing
> the ATT in the -psmatch2- help file is bad advice.
> [ http://fmwww.bc.edu/repec/bocode/p/psmatch2.html accessed today ]
>
> Instead compute your own propensity scores within stratum, then find
> matches within stratum for the whole sample by constructing a new
> score which is stratum id times 100 plus propensity score, and let
> -psmatch2- tell you the ATT for the whole sample after matching on
> that new score. Or better, use the propensity scores to reweight, or
> use a double robust model. Either way, I think you should worry about
> clustered errors. I hope you have more than 12 strata though...
>
> webuse nlswork, clear
> egen g = group(ind_code)
> levels g, local(gr)
> loc x collgrad union tenure age race
> loc t msp
> loc y ln_wage
> qui reg `y' `t' `x', cl(g)
> keep if e(sample)
> g ps=.
> qui foreach j of local gr {
>   logit `t' `x' if (g ==`j')
>   predict ps`j' if e(sample)
>   replace ps=ps`j' if e(sample)
> }
> g r=ps+g*100
> psmatch2 `t', out(`y') pscore(r)
> * mean comparison using matches
> reg `y' `t' [pw=_weight], nohe
> * mean comparison with CRSE
> reg `y' `t' [pw=_weight], cl(g) nohe
> egen m=mean(`t'), by(g)
> g w=cond(`t',m/(1-m),ps/(1-ps))
> ta `t'
> ta `t' [aw=w]
> ta g `t' , row nofr
> ta g `t' [aw=w], row nofr
> * mean comparison using PS weights
> reg `y' `t' [pw=w], nohe
> * mean comparison with CRSE
> reg `y' `t' [pw=w], cl(g) nohe
> * double robust
> reg `y' `t' `x' [pw=w], nohe
> * double robust with CRSE
> reg `y' `t' `x' [pw=w], cl(g) nohe
>
>
> On Fri, Feb 24, 2012 at 12:48 PM, Richard Herron
> <richard.c.herron@gmail.com> wrote:
>> I am using -psmatch2- (SSC) to match on propensity score and would
>> like to limit matches to the same strata. The example code in
>> -psmatch2-'s help file iterates over strata with a -foreach- loop and
>> -if- statements and returns -att- (average treatment of the treated)
>> for each strata, which the code stores for every observation in each
>> strata (I could also limit this to actually matched treatment and
>> control). Then the example code finds the mean -att- with -summarize-.
>> I would like to make inferences on the mean -att-, however there are
>> many observations, so standard errors are small and t-statistics
>> large. If I use one -att- observation per strata (similar to a
>> Fama-MacBeth regression), then I get completely different inferences.
>>
>> Is there a better way to limit matches within strata? I have tried the
>> -mahalanobis- option using the whole data set, but the data are very
>> large and it takes quite a long time (more than overnight to perform
>> matching on distance in the whole data set).
>>
>> Or is the correct answer to avoid these inference problems to _not_
>> aggregate my -att- over strata and evaluate them strata by strata?
>> Thanks! I provide an example below.
>>
>> In my case the strata are fiscal years, but in the readily available
>> data the strata are industry codes.
>>
>> * begin code
>>
>> * ssc install psmatch2
>> webuse nlswork, clear
>> generate att = .
>> egen g = group(ind_code)
>> levels g, local(gr)
>> quietly foreach j of local gr {
>>   psmatch2 msp collgrad union tenure age race ///
>>     if (g == `j'), out(ln_wage)
>>   replace att = r(att) if (g == `j')
>> }
>>
>> * but -psmatch2- assigns same -att- to every matched observed in each group -g- (here grouped on -ind_code-)
>> table g att
>>
>> * so the standard errors are very small and the t-stats too large
>> summarize att
>> scalar t = `r(mean)' / (`r(sd)' / (`r(N)' - 1))
>> scalar list t
>>
>> * is there a more sensible way to get the correct inference, while still forcing matches only within a given strata?
>> * one thought is to collapse on -g-
>> collapse (mean) att, by(g)
>> summarize att
>> scalar t = `r(mean)' / (`r(sd)' / (`r(N)' - 1))
>> scalar list t
>>
>> * here the inference flips from negative significant to positive significant, but I haven't done any weighting by group
>>
>> * end code
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
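A note on the t-statistics in the quoted example: they divide `r(sd)` by `r(N) - 1`, whereas the standard error of a mean is usually taken as sd/sqrt(N). The following is a minimal sketch of the one-observation-per-stratum test with that standard error. The 12 `att` values are simulated placeholders (an assumption for illustration only, not output from -psmatch2-); with real data the variable would come from the -collapse (mean) att, by(g)- step in the quoted code.

```stata
* Sketch only: the att values below are simulated placeholders,
* standing in for one -att- estimate per stratum after -collapse-.
clear
set seed 2012
set obs 12
generate g = _n
generate att = rnormal(0.05, 0.10)

* t-statistic for H0: mean(att) == 0, using se = sd / sqrt(N)
quietly summarize att
scalar se = r(sd) / sqrt(r(N))
scalar t = r(mean) / se
scalar list t

* cross-check: the one-sample t test stores the same statistic in r(t)
quietly ttest att == 0
display "t from ttest = " r(t)
```

This treats every stratum equally, so it carries the same caveat as the collapsed version in the quoted code (no weighting by group), and with few strata the test has only N - 1 degrees of freedom.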

**References**:

- **st: Obtaining correct inference from within strata -att- estimation with -psmatch2-** *From:* Richard Herron <richard.c.herron@gmail.com>
- **Re: st: Obtaining correct inference from within strata -att- estimation with -psmatch2-** *From:* Austin Nichols <austinnichols@gmail.com>
