
From: "Gao Liu" <gao.liu@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: One to N matching
Date: Tue, 18 Nov 2008 14:13:51 -0500

Thanks, Austin, I'll check out your do file.

Best,
Gao

On Tue, Nov 18, 2008 at 2:08 PM, Austin Nichols <austinnichols@gmail.com> wrote:
> Gao Liu--
> Note that -vmatch- does not select k nearest neighbors without
> replacement, though it will find all matches within caliper (i.e. all
> obs j with p_j no more than r away from a particular observation's
> value p_i), which is not guaranteed to get you any closer to the goal.
> -psmatch2- will select k nearest neighbors, but only with
> replacement, and it only saves the identifier of the first matched
> observation. Probably the best thing for you to do is to clone
> -psmatch2- into a new file as -mypsmatch2- and modify the Mata code to
> save additional identifiers in a Stata matrix. But you can also loop
> over observations and match the hard way. It is unclear to me why you
> would ever want to do this; matching k obs without replacement makes
> the calculation of standard errors much harder, and the bootstrap is
> not an option with matching. And you have to decide what to do about
> ties...
>
> Here's a quickly cobbled together version of matching by hand by
> looping over observations; no warranty expressed or implied that it
> will be appropriate or easy to adapt for your application...
>
> use http://pped.org/stata/card, clear
> g case=educ>16
> * propensity score p from a logit of case on covariates
> qui logit case exper* smsa south
> predict p
> set seed 123
> g double u=uniform()
> * put cases first, in random order, so _id 1..r(N) are the cases
> gsort -case u
> g _id=_n
> * z==0 marks non-cases still available; used ones are set to missing
> g z=case
> loc n=4
> forv j=1/`n' {
>  g match`j'=.
>  g p`j'=.
> }
> count if case==1
> forv i=1/`r(N)' {
>  g diff=abs(p-p[`i'])
>  * available non-cases sort first, nearest score on top
>  sort z diff
>  qui forv j=1/`n' {
>   loc match`j'=_id[`j']
>   loc p`j'=p[`j']
>   replace z=. in `j'
>  }
>  drop diff
>  sort _id
>  * store the matched ids and scores on the row of case i
>  qui forv j=1/`n' {
>   replace match`j'=`match`j'' in `i'
>   replace p`j'=`p`j'' in `i'
>  }
> }
> li _id p* match* in 1/15, noo clean
> li _id p* match* in 351/360, noo clean
>
> Let me just repeat--I think this is a bad idea, in the sense that I
> cannot think of a reason to do this as opposed to using -psmatch2- or
> -nnmatch- (also on SSC) or reweighting. See also
> http://pped.org/stata/erratum.pdf on reweighting.
>
>
> On Tue, Nov 18, 2008 at 1:24 PM, Richard Goldstein
> <richgold@ix.netcom.com> wrote:
>> -vmatch- does provide 1 to N because it finds all matches for each case; in
>> a recent match that I did I found anywhere from 1 to 18 matches for each of
>> my cases
>>
>> Rich
>>
>> Gao Liu wrote:
>>>
>>> Thank you, Richard and Austin,
>>>
>>> The score in my dataset is actually from psmatch2. Although psmatch2
>>> provides 1 to n matching, it does not indicate which non-case
>>> observations are matched to the case obs, except for the nearest one.
>>> Also, it allows duplicated matching. That is why I plan to do it by
>>> hand.
>>>
>>> I just checked the command, vmatch. I did not find that it provides the
>>> option of 1 to N matching.
>>>
>>> Austin, can you describe some more how to do it by hand?
>>>
>>> best
>>>
>>> Gao
>>>
>>> On Tue, Nov 18, 2008 at 11:32 AM, Austin Nichols
>>> <austinnichols@gmail.com> wrote:
>>>>
>>>> Gao Liu--
>>>> Actually, I think you are looking for -psmatch2- (findit psmatch2).
>>>> Or did you want to program the matching by hand? That is also
>>>> possible, and not very hard in the case where all you want is the
>>>> nearest N matches. However, note that the order of matching will
>>>> matter in the situation you describe--matching without replacement--so
>>>> you should probably do the matching many times and compute statistics
>>>> using the rules of variance computation for multiple imputation.
>>>>
>>>> On Tue, Nov 18, 2008 at 11:07 AM, Richard Goldstein
>>>> <richgold@ix.netcom.com> wrote:
>>>>>
>>>>> for already existing programs, rather than writing your own, I would
>>>>> start with -vmatch- (user-written, type -findit vmatch-)
>>>>>
>>>>> I'm not sure it will cover your last criterion (used only once) but if
>>>>> not it should be easy to eliminate those
>>>>>
>>>>> Rich
>>>>>
>>>>> Gao Liu wrote:
>>>>>>
>>>>>> Dear Statalist:
>>>>>>
>>>>>> I have a question about one to N matching.
>>>>>>
>>>>>> I have a dataset containing three variables: id, score, case, where
>>>>>> case is a dummy variable indicating whether or not the observation is
>>>>>> in the case group. How can I match each case observation to N non-case
>>>>>> observations based on the score? Each case observation matches to the
>>>>>> N non-case observations with the closest scores, but no two case
>>>>>> observations can match the same non-case observation (i.e. each
>>>>>> non-case observation can be used only one time).
>>>>>>
>>>>>> Thank you
>>>>>>
>>>>>> Best
>>>>>>
>>>>>> Gao
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
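
Editorial note on the thread above: Austin's 11:32 AM message suggests repeating the without-replacement matching many times (the order of matching changes the result) and combining the runs with "the rules of variance computation for multiple imputation". The thread does not spell those rules out. A sketch of the standard combining rules (Rubin's rules) follows; the notation is an assumption introduced here, not taken from the thread: M is the number of matching runs, \hat{\theta}_m is the statistic from run m, and U_m is its estimated variance within that run. Whether these rules give valid standard errors for a particular matching estimator is not established in the thread.

```latex
\begin{align*}
\bar{\theta} &= \frac{1}{M}\sum_{m=1}^{M}\hat{\theta}_m
  && \text{(combined point estimate)}\\
W &= \frac{1}{M}\sum_{m=1}^{M} U_m
  && \text{(average within-run variance)}\\
B &= \frac{1}{M-1}\sum_{m=1}^{M}\bigl(\hat{\theta}_m-\bar{\theta}\bigr)^2
  && \text{(between-run variance)}\\
T &= W + \Bigl(1+\frac{1}{M}\Bigr)B
  && \text{(total variance)}\\
\operatorname{se}(\bar{\theta}) &= \sqrt{T}
\end{align*}
```

In this setting each run would use a different random order of the cases (for example, a different seed before the random sort in the do-file above), so B captures how much the statistic moves with the matching order.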


