Re: st: Obtaining correct inference from within strata -att- estimation with -psmatch2-

From	Richard Herron <[email protected]>
To	[email protected]
Subject	Re: st: Obtaining correct inference from within strata -att- estimation with -psmatch2-
Date	Fri, 24 Feb 2012 18:49:17 -0500

Thanks for the lesson, Austin! I will have to dig back into the books
to get a better grasp of your odds ratio weighting in the -w = cond()-
statement.
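
For reference, here is one reading of what that statement builds. It is a
sketch taken from the quoted code, not Austin's own derivation, and the
symbols are notation assumed here: \hat{p}_i is the within-stratum
propensity score (-ps-), m_g is the share treated in stratum g (-m-), and
T_i is the treatment dummy (-msp-).

```latex
w_i =
\begin{cases}
  \dfrac{m_{g(i)}}{1 - m_{g(i)}} & \text{if } T_i = 1 \\[1.5ex]
  \dfrac{\hat{p}_i}{1 - \hat{p}_i} & \text{if } T_i = 0
\end{cases}
```

So each control is weighted by its estimated odds of treatment, and each
treated observation by the odds implied by its stratum's treated share.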

And thanks for the stratum/strata lesson, Nick.

On Fri, Feb 24, 2012 at 15:23, Austin Nichols <[email protected]> wrote:
> Richard Herron <[email protected]> :
> I think the advice given on matching within strata and then computing
> the ATT in the -psmatch2- help file is bad advice.
> [ http://fmwww.bc.edu/repec/bocode/p/psmatch2.html accessed today ]
>
> Instead compute your own propensity scores within stratum, then find
> matches within stratum for the whole sample by constructing a new
> score which is stratum id times 100 plus propensity score, and let
> -psmatch2- tell you the ATT for the whole sample after matching on
> that new score.  Or better, use the propensity scores to reweight, or
> use a double robust model.  Either way, I think you should worry about
> clustered errors.  I hope you have more than 12 strata though...
>
> webuse nlswork, clear
> egen g = group(ind_code)
> levelsof g, local(gr)
> loc x collgrad union tenure age race
> loc t msp
> loc y ln_wage
> qui reg `y' `t' `x', cl(g)
> keep if e(sample)
> g ps=.
> qui foreach j of local gr {
>  logit `t' `x' if (g ==`j')
>  predict ps`j' if e(sample)
>  replace ps=ps`j' if e(sample)
>  }
> g r=ps+g*100
> psmatch2 `t', out(`y') pscore(r)
> * mean comparison using matches
> reg `y' `t'  [pw=_weight], nohe
> * mean comparison with CRSE
> reg `y' `t'  [pw=_weight], cl(g) nohe
> egen m=mean(`t'), by(g)
> g w=cond(`t',m/(1-m),ps/(1-ps))
> ta `t'
> ta `t' [aw=w]
> ta g `t' , row nofr
> ta g `t' [aw=w], row nofr
> * mean comparison using PS weights
> reg `y' `t'  [pw=w], nohe
> * mean comparison with CRSE
> reg `y' `t'  [pw=w], cl(g) nohe
> * double robust
> reg `y' `t' `x' [pw=w], nohe
> * double robust with CRSE
> reg `y' `t' `x' [pw=w], cl(g) nohe
>
>
> On Fri, Feb 24, 2012 at 12:48 PM, Richard Herron
> <[email protected]> wrote:
>> I am using -psmatch2- (SSC) to match on propensity score and would
>> like to limit matches to the same strata. The example code in
>> -psmatch2-'s help file iterates over strata with a -foreach- loop and
>> -if- statements and returns -att- (average treatment effect on the
>> treated) for each strata, which the code stores for every observation
>> in each strata (I could also limit this to actually matched treatment
>> and control). Then the example code finds the mean -att- with
>> -summarize-.
>> I would like to make inferences on the mean -att-, however there are
>> many observations, so standard errors are small and t-statistics
>> large. If I use one -att- observation per strata (similar to a
>> Fama-MacBeth regression), then I get completely different inferences.
>>
>> Is there a better way to limit matches within strata? I have tried the
>> -mahalanobis- option using the whole data set, but the data are very
>> large and it takes quite a long time (more than overnight to perform
>> matching on distance in the whole data set).
>>
>> Or is the correct answer to avoid these inference problems to _not_
>> aggregate my -att- over strata and evaluate them strata by strata?
>> Thanks! I provide an example below.
>>
>> In my case the strata are fiscal years, but in the readily available
>> data the strata are industry codes.
>>
>> * begin code
>>
>> * ssc install psmatch2
>> webuse nlswork, clear
>> generate att = .
>> egen g = group(ind_code)
>> levelsof g, local(gr)
>> quietly foreach j of local gr {
>>        psmatch2 msp collgrad union tenure age race ///
>>            if (g == `j'), out(ln_wage)
>>        replace att = r(att) if  (g == `j')
>> }
>>
>> * but -psmatch2- assigns same -att- to every matched observation in each
>> * group -g- (here grouped on -ind_code-)
>> table g att
>>
>> * so the standard errors are very small and the t-stats too large
>> summarize att
>> * one-sample t statistic: mean / (sd / sqrt(N))
>> scalar t = r(mean) / (r(sd) / sqrt(r(N)))
>> scalar list t
>>
>> * is there a more sensible way to get the correct inference, while
>> * still forcing matches only within a given strata?
>> * one thought is to collapse on -g-
>> collapse (mean) att, by(g)
>> summarize att
>> * one-sample t statistic: mean / (sd / sqrt(N))
>> scalar t = r(mean) / (r(sd) / sqrt(r(N)))
>> scalar list t
>>
>> * here the inference flips from negative significant to positive
>> * significant, but I haven't done any weighting by group
>>
>> * end code
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
