
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


re: st: propensity matching and matched pair analysis

From   "Ariel Linden. DrPH" <>
To   <>
Subject   re: st: propensity matching and matched pair analysis
Date   Sat, 18 Aug 2012 09:40:37 -0700

Hi Claude,

First off, as part of the Statalist requirements, listers are asked to state
which program they are using and where it can be found. -psmatch2- is a
user-written program available from SSC (-findit psmatch2-).

With regard to your query, a couple of different things are going on here.
First, there is still an ongoing debate as to whether you need to run
non-parametric statistics after matching. The issue is even less clear when
you use 1:k matching, since matched individuals will likely differ on some
covariates even though the groups appear balanced at the cohort level (so on
average the groups will be comparable, but any two matched individuals may
not be). This is exactly what happens in an RCT: random assignment ensures
balance on covariates at the aggregate level, but not necessarily between
any two people.
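
As a rough illustration of checking balance at the cohort level, -pstest-
(installed along with -psmatch2-) can be run right after the match. This is
only a sketch: age, sex and bmi are hypothetical covariate names, not
variables from your data.

```stata
* Sketch only: age, sex and bmi are hypothetical covariate names.
* Run right after -psmatch2- so that its _* variables are still in memory.
pstest age sex bmi
```

Balance in that table describes the groups as a whole; it says nothing about
how close any particular treated subject is to his or her own controls.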

What this means is that you can use parametric statistics if they are more
suitable for answering your particular research question. In fact, there is
even an ongoing debate as to whether the researcher should weight the
observations in 1:k matching when fixed-ratio matching is applied (as you
did, with a fixed 2 controls for every 1 treated). Had you used
variable-ratio matching, weighting would have been necessary.
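
If you do want to try weights, a minimal sketch follows. It assumes
-psmatch2- left its _weight and _treated variables in memory (see -help
psmatch2- for how _weight is defined under your options) and that _weight is
missing for unmatched observations. The name outcome is hypothetical.

```stata
* Sketch only: "outcome" is a hypothetical outcome variable.
* Unweighted comparison on the matched sample
* (assumes _weight is missing for observations that were not matched)
regress outcome _treated if !missing(_weight)

* Same comparison using the matching weights
regress outcome _treated [pweight = _weight]
```

Comparing the two sets of estimates gives a feel for how much the weighting
matters in your data.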

The next issue you ask about is identifying the specific treated and matched
controls. As per the help file, several new variables are generated:

        _id In the case of one-to-one and nearest-neighbors matching, a new
        identifier created for all observations.

        _nk In the case of one-to-one and nearest-neighbors matching, for
        every treatment observation, it stores the observation number of the
        k-th matched control observation. Do not forget to sort by _id if you
        want to use the observation number (id) of, for example, the 1st
        nearest neighbor, as in

        sort _id
        g x_of_match = x[_n1]

        _nn In the case of nearest-neighbors matching, for every treatment
        observation, it stores the number of matched control observations.

Thus, you can order the matches so that they fall into groups.
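
One way this grouping could be done is sketched below. It assumes -psmatch2-
was run with neighbor(2), so that _id, _treated, _n1 and _n2 are in memory.
The names outcome, y_ctrl*, member*, set_id and k are all hypothetical and
introduced only for illustration.

```stata
* Sketch only: "outcome" is a hypothetical outcome variable.
sort _id                                  // needed before subscripting with _n1, _n2

* Wide layout: copy each matched control's outcome onto the treated row
generate double y_ctrl1 = outcome[_n1] if _treated == 1
generate double y_ctrl2 = outcome[_n2] if _treated == 1
egen double y_ctrl = rowmean(y_ctrl1 y_ctrl2)

* Paired comparison of each treated subject with the mean of its controls
signrank outcome = y_ctrl if _treated == 1

* Long layout: one row per member of each matched set
tempfile full
save `full'                               // keep a copy of the full data
keep if _treated == 1 & !missing(_n1)
generate long set_id  = _id               // name each set after its treated member
generate long member0 = _id               // the treated subject itself
generate long member1 = _n1               // 1st matched control
generate long member2 = _n2               // 2nd matched control
keep set_id member0 member1 member2
reshape long member, i(set_id) j(k)       // k = 0 is treated; k = 1, 2 are controls
drop if missing(member)
rename member _id
merge m:1 _id using `full', keep(match) nogenerate
sort set_id k
```

After the last step the data in memory hold one row per set member, and the
original data remain in the temporary file. If controls were matched with
replacement, the same control can appear in more than one set, which is why
the long layout is built as a separate file. set_id can then be supplied to
commands that accept a grouping variable, for example the group() option of
-clogit-.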

I hope this helps.


Date: Fri, 17 Aug 2012 20:09:21 +0000
From: Claude Beaty <>
Subject: st: propensity matching and matched pair analysis


I have used the psmatch2 command to successfully create a 2:1 nearest
neighbor match in my data based upon the presence of a specific disease.
This has resulted in 630 cases and 1200 controls. I can identify the matches
by the unique ID numbers created during the match process. In order to
accurately analyze these data, I know that I need to run McNemar's test for
categorical variables and the signed-rank test for continuous variables to
account for the pairing. However, currently the data are listed as 1800
different observations. Is there an easy way to group the cases with their
controls, or do I have to manually create 630 groups and then sum the
results from each group to run these analyses?

Claude A. Beaty Jr., M.D.
Halsted Surgical Resident
Cardiac Surgery Research Fellow
The Johns Hopkins Hospital
