re: Re: Re: st: Pretreatment pair matching

From   "Ariel Linden, DrPH" <>
To   <>
Subject   re: Re: Re: st: Pretreatment pair matching
Date   Thu, 4 Oct 2012 11:10:33 -0400

I honestly don't understand what you are trying to do here. I suggested
-cem- (a user-written program found on SSC) as an approach to matching.
Several other packages also perform matching; see, for example, the
concurrent thread on the list, "Matching samples in Stata", in which
David Kantor describes his user-written suite of programs, such as
-mahapick-. I suggest you do a bit of searching and reading up on the
available programs for what you are trying to accomplish.

Date: Wed, 3 Oct 2012 14:13:20 +0200
From: Wojciech Hardy <>
Subject: Re: Re: st: Pretreatment pair matching

Thank you both for your answers. I've tried several approaches and
finally arrived at a set of variables containing Mahalanobis distances
between each pair from my dataset (using the mahascores program, from
the mahapick package).
I would now like Stata to find the optimal combination of pairs,
although once again I don't know how to do it. I know there are
algorithms that do this, but most of them start at the
treated/untreated level, and I don't know how to make them match from
a distance matrix provided by me.
Is there a clever way to do this with Stata?

Thanks indeed,
Wojciech Hardy
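
One possible starting point for pairing from a supplied set of distances is a greedy pass: repeatedly take the closest pair whose members are both still unmatched. This is only a sketch and is not an optimal matching; a greedy pass can give a larger total distance than the best possible set of pairs. It assumes the distances have been reshaped to long format, one row per unordered pair, and the variable names id1, id2, dist, avail, and pair_id are hypothetical (they are not what -mahascores- produces). The toy data stand in for real distances.

```stata
* Toy long-format distances: one row per unordered pair (hypothetical names)
clear
input id1 id2 dist
1 2 0.9
1 3 0.2
1 4 0.7
2 3 0.8
2 4 0.3
3 4 0.6
end

* Closest pairs first
sort dist id1 id2

gen byte avail   = 1      // 1 while both members of the row are unmatched
gen long pair_id = .
local k = 0
local N = _N

forvalues r = 1/`N' {
    if avail[`r'] {
        * Accept this row as a pair
        local ++k
        local a = id1[`r']
        local b = id2[`r']
        quietly replace pair_id = `k' in `r'
        * Rule out every row that involves either matched id
        quietly replace avail = 0 if inlist(id1, `a', `b') | inlist(id2, `a', `b')
    }
}

* Keep only the accepted pairs
keep if !missing(pair_id)
list id1 id2 dist pair_id
```

With the toy data, the pass accepts the pair (1, 3) first and then (2, 4). With an odd number of items, one item is left unmatched.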

2012/9/21 Ariel Linden, DrPH <>:
> Another approach you might consider is using coarsened exact matching,
> -cem-, a user-written program found on SSC (ssc install cem).
> Using -cem-, you would omit the treatment() option, which would cause the
> program to sort the observations into strata without matching criteria.
> See section 5.6 "Blocking in Randomized Experiments" in:
> Iacus, Stefano M., Gary King, and Giuseppe Porro. "Causal Inference
> Without Balance Checking: Coarsened Exact Matching." Political Analysis (2011).
> Ariel
> Date: Thu, 20 Sep 2012 12:46:28 -0400
> From: Austin Nichols <>
> Subject: Re: st: Pretreatment pair matching
> Wojciech Hardy <>:
> What variables do you want to match on to make pairs?
> If they are all categorical and a relatively small number, you might
>  egen class=group(x*)
> and randomize within class, or:
>  bys x*: g pair_id=ceil((_n-mod(_n-1,2))/2)
>  g u=uniform()
>  bys x* pair_id (u): g treatment=_n==1
> or somesuch.
> With some continuous variables, you may prefer to construct a
> Mahalanobis distance from group means, and then sort by that distance
> within group before assigning a pair id. See also -help cluster- and
> the related manual entries.
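
The suggestion above (sort on a Mahalanobis distance from the means, then assign a pair id) might be sketched as follows. This is a minimal illustration under assumptions: it uses the auto dataset shipped with Stata, treats the whole sample as a single group, matches on price and mpg only, and the names d1, d2, maha, u, pair_id, and treatment are made up for the example. Note that sorting on distance from the mean is a heuristic; two observations equally far from the mean in different directions can end up paired.

```stata
sysuse auto, clear

* Covariance matrix of the matching variables and its inverse
quietly correlate price mpg, covariance
matrix V  = r(C)
matrix Vi = syminv(V)

* Deviations from the sample means
quietly summarize price
generate double d1 = price - r(mean)
quietly summarize mpg
generate double d2 = mpg - r(mean)

* Mahalanobis distance of each observation from the mean vector
generate double maha = sqrt(Vi[1,1]*d1^2 + 2*Vi[1,2]*d1*d2 + Vi[2,2]*d2^2)

* Consecutive observations in distance order form a pair
sort maha
generate long pair_id = ceil(_n/2)

* Randomly pick one member of each pair for treatment
set seed 12345
generate double u = runiform()
bysort pair_id (u): generate byte treatment = _n == 1
```

To pair within categories, the sort and the pair id assignment could be done by group instead, for example with a by-group prefix on the categorical variables.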
> You might also want to read "The Essential Role of Pair Matching in
> Cluster-Randomized Experiments, with Application to the Mexican
> Universal Health Insurance Evaluation" by Kosuke Imai, Gary King and
> Clayton Nall (2009):
> On Thu, Sep 20, 2012 at 6:44 AM, Wojciech Hardy <>
> wrote:
>> Hello all,
>> I'm trying to conduct a pair matching procedure only without having
>> the treatment group and control group set beforehand.
>> In fact we intend to use the pair matching to help us decide which
>> items should go to the treatment group (i.e. we'd like to find
>> 'twins', and then put one of them into the treatment group, and
>> keep the other one in the control group, thus increasing the efficiency of
>> later statistical analysis).
>> I've found some commands allowing for measuring the treatment effect,
>> by comparing twins, but these do not apply to my case and
>> unfortunately I don't have the skills to modify the commands to give
>> me what I need.
>> What I'd want is to match the observations into pairs (generate some
>> "pair ID" probably), based on specified variables, without doing
>> anything else.
>> I thought of solving this by duplicating the dataset, giving the new
>> copy a "1" value in a treatment variable, and making the already
>> created commands match between the two identical groups. I'd have to
>> make it not match the observations with the same ID however (i.e. with
>> their own copies), and I don't know how to do it. Also, I'm not sure
>> if these commands report what pairs they've made in the process.
>> I haven't found any solution to this on the net, so I'll be really
>> grateful for any help!
