
RE: st: propensity matching and matched pair analysis


From   Claude Beaty <cbeaty1@jhmi.edu>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   RE: st: propensity matching and matched pair analysis
Date   Tue, 21 Aug 2012 13:46:08 +0000

All,

First, to Ariel: I apologize for the delay in responding to your email. I appreciate your opinion on the use of parametric statistics and would love to avoid this whole issue for those reasons, but the article I am submitting based on these data is subject to peer review, and the reviewers would like me to account for the pairing in my statistics (I initially used the t-test, Fisher's exact test, and chi-squared analysis on my matched data to describe the covariate relationships based on disease presence).

As for my questions: 
1) Can McNemar's test be used in a 2:1 analysis or only in a 1:1 analysis?  If this is the wrong test, what would be a better analysis for a 2:1 match (Mantel-Haenszel etc)? 

Assuming this is the right test, I realize that the cases and controls need to be matched and analyzed as a group for McNemar's test and the signed-rank test. I also know the groupings, as you indicated and I previously mentioned, based on the _id and _nk variables created by the -psmatch2- command. Currently, my data set is 1800 observations (rows) of patient information. I can create a grouping variable with the 600 groups (3 patients per group as per the 2:1 matching) by hand but would prefer not to.

2) Is there an easy way to create these groups or do I have to write out 600 lines of code, as each group will have 3 different _id codes associated with it?

Finally, my understanding of the -mcc- command for McNemar's test is that the individual numbers associated with the various boxes of the concordant table are to be entered by the user, not calculated by the program. I believe this means I need to manually interrogate every grouping by exposure to a variable of interest and then sum these results by hand, to enter them into a final McNemar's concordant table describing all of the groups' relationships based on this one variable. This process would then have to be repeated for every variable of interest. As I have 600 groups and >40 variables of interest, this could prove to be prohibitively tedious.

3) Is this understanding of the command correct? If so, is there a way in which the program can calculate one discordant table for all groups but based on individual intra-group interactions?
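
For reference on the point above, a minimal sketch of the two forms of the command: -mcc- reads a pair of exposure variables from the data and tabulates the pairs itself, while -mcci- is the immediate form in which the four cell counts are typed in. The toy data and the variable names pairid, case_exp and ctrl_exp are made up for illustration, and the layout assumes 1:1 pairs stored one row per pair.

```stata
* Toy data, one row per matched pair (hypothetical names and values)
clear
input pairid case_exp ctrl_exp
1 1 0
2 1 1
3 0 0
4 1 0
5 0 1
end

* -mcc- builds the paired 2x2 table from the two variables
mcc case_exp ctrl_exp

* -mcci- takes the same four cells typed in by hand:
* both exposed, case only, control only, neither
mcci 1 2 1 1
```

Looping -mcc- over a list of exposure variables would avoid retyping the table for each one, provided each pair sits on a single row as in the toy data.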

Thank you.

Claude A. Beaty Jr., M.D.
Halsted Surgical Resident
Cardiac Surgery Research Fellow
The Johns Hopkins Hospital


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Ariel Linden. DrPH
Sent: Saturday, August 18, 2012 12:41 PM
To: statalist@hsphsun2.harvard.edu
Subject: re: st: propensity matching and matched pair analysis

Hi Claude,

First off, as part of the Statalist requirements, listers are asked to note what program they are using and where it is found. -psmatch2- is a user-written program and can be found on SSC (findit psmatch2).

In regards to your query, you have a couple different things going on here.
First, there is still an ongoing debate as to whether you need to run non-parametric statistics after matching. This issue is even less clear when you use 1:k matches, since matched individuals will likely differ on some covariates even though they appear to be balanced at the cohort level (so on average the groups will be comparable, but any two matched individuals may not be). This is exactly what happens in an RCT - the random assignment ensures balance on covariates at the aggregate level, but not necessarily between any two people. 

What this means is that you can use parametric statistics if they are more suitable to answer your particular research question. In fact, there is even an ongoing debate as to whether the researcher should weight the observations in 1:k matching when fixed-ratio matching is applied (as you did, with a fixed 2 controls for every 1 treated). Had you used variable matching, weighting would have been necessary.

The next issue you ask about is identifying the specific treated and matched controls. As per the help file, several new variables are generated:

        _id In the case of one-to-one and nearest-neighbors matching, a new
        identifier created for all observations.

        _nk In the case of one-to-one and nearest-neighbors matching, for every
        treatment observation, it stores the observation number of the k-th
        matched control observation. Do not forget to sort by _id if you want
        to use the observation number (id) of for example the 1st nearest
        neighbor as in

        sort _id
        g x_of_match = x[_n1]

        _nn In the case of nearest-neighbors matching, for every treatment
        observation, it stores the number of matched control observations.

Thus, you can order the matches so that they fall into groups.
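
As a rough sketch of that ordering (not taken from the help file): assuming -psmatch2- was run with neighbor(2), so that _treated, _id, _n1 and _n2 exist, and assuming _id is unique and non-missing in the data, the matches can be stacked into one row per member of each matched set. The names setid, role and m0-m2 are made up for illustration.

```stata
* Sketch only: run as one do-file right after psmatch2 ..., neighbor(2)
* Assumes _id is unique and non-missing in the data in memory
tempfile full
save `full'

* One row per treated observation that has both matches
keep if _treated == 1 & !missing(_n1, _n2)
keep _id _n1 _n2
generate long setid = _id          // one matched set per treated observation
rename (_id _n1 _n2) (m0 m1 m2)    // m0 = case, m1 and m2 = its two controls

* Stack to one row per set member; role 0 = case, 1 or 2 = control
reshape long m, i(setid) j(role)
rename m _id

* Attach the patient-level variables to every member of every set
merge m:1 _id using `full', keep(match) nogenerate
sort setid role
```

The result replaces the data in memory with the long matched file. If matching was done with replacement, a control can appear in more than one set, which is why the merge is m:1. With setid in place, commands that take a grouping variable (for example -clogit- with its group() option) can refer to the matched sets directly.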

I hope this helps

Ariel



Date: Fri, 17 Aug 2012 20:09:21 +0000
From: Claude Beaty <cbeaty1@jhmi.edu>
Subject: st: propensity matching and matched pair analysis

All,

I have used the psmatch2 command to successfully create a 2:1 nearest-neighbor match in my data based upon the presence of a specific disease.
This has resulted in 630 cases and 1200 controls. I can identify the matches by the unique ID numbers created during the match process. In order to accurately analyze these data, I know that I need to run McNemar's test for categorical variables and the signed-rank test for continuous variables to account for the pairing. However, currently the data are listed as 1800 different observations. Is there an easy way to group the cases with their controls, or do I have to manually create 630 groups by hand and then sum the results from each group to run these analyses?

Claude A. Beaty Jr., M.D.
Halsted Surgical Resident
Cardiac Surgery Research Fellow
The Johns Hopkins Hospital



*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


