
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


RE: st: propensity score (diff group sizes: treatment >>> control)

From   <>
To   <>
Subject   RE: st: propensity score (diff group sizes: treatment >>> control)
Date   Fri, 21 Feb 2014 11:54:51 +0000

Many thanks, very useful!


-----Original Message-----
From: [] On Behalf Of Ariel Linden
Sent: 19 February 2014 15:04
Subject: re: st: propensity score (diff group sizes: treatment >>> control)

First off, it is expected on Statalist that you state which programs you are using and where you got them. In this case, I am assuming from the command lines you provided that you are using -pscore- and -psmatch2- (both are user-written programs; see findit pscore and findit psmatch2).

Second, you have only 40 controls to match against 500 treated. This basically leaves you with two choices: 

(1) Match with replacement. The potential problem here is that you will likely use some (or all) of these controls so many times that I'd question the generalizability of the results. (Can one control really serve as a counterfactual for 100 treated individuals? That does not seem to be a very good strategy.) In any case, if you go this route you'll need to use a frequency weight to account for the number of times that each control was used.
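
As a rough sketch of option (1), and not necessarily what the poster should run as-is: the lines below assume -psmatch2- is installed, that Group is coded 0/1 (1 = treated), and that the score variable myscore from the original post already exists. -psmatch2- matches with replacement unless noreplacement is specified, and it stores the number of times each observation was used as a match in a variable called _weight.

```stata
* Sketch only: 1-nearest-neighbour matching on the existing score.
* Matching is with replacement by default (no noreplacement option).
psmatch2 Group, outcome(GP_Times) pscore(myscore) neighbor(1)

* _weight is created by psmatch2; for nearest-neighbour matching it holds
* how often each observation was used as a match.
* Look at how heavily each of the 40 controls is being reused.
tabulate _weight if Group == 0

* Assumption: weights are whole numbers (neighbor(1), no ties option),
* which fweight requires. Unmatched observations have missing _weight
* and are dropped from the regression.
regress GP_Times Group [fweight = _weight]
```

If the tabulation shows a handful of controls carrying most of the weight, that is the reuse problem described above.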

(2) A perhaps more reasonable approach would be to flip the matching so that you're matching treated to controls. In other words, find the 40+ treated units that are most comparable on observed characteristics to those 40 controls. This will change your treatment effect estimator to the ATC (average treatment effect on the controls).
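
One way option (2) might look in practice is sketched below. It assumes Group is coded 0/1 with no missing values, and it lets -psmatch2- estimate the score itself (here with the logit option) rather than using -pscore-. The variable name Group_opposite follows the original post.

```stata
* Sketch only: flip the indicator so the 40 controls play the "treated" role.
generate byte Group_opposite = 1 - Group

* Each original control is matched to its nearest treated unit.
* With covariates listed, psmatch2 estimates the propensity score itself.
psmatch2 Group_opposite Age Gender Living, outcome(GP_Times) neighbor(1) logit

* The "ATT" that psmatch2 reports here is
*   mean(outcome of controls) - mean(outcome of their matched treated units),
* so it has the opposite sign to a treated-minus-control effect.
* Negating it gives the ATC in the original orientation.
```

Note that the sign convention matters when comparing this result with the one from the unflipped analysis.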

In both cases above, I would suggest that you stick with a matching algorithm that pairs individual units, as opposed to kernel matching. It will be easier to visually inspect the matches to see if they pass the "sniff test".
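
For the visual inspection, something along these lines may help. It is a sketch that assumes -psmatch2- (which also provides -pstest-) is installed and uses the variable names from the original post. The variables _id, _n1, _pscore, _pdif, _treated and _support are created by -psmatch2- after nearest-neighbour matching.

```stata
* Sketch only: nearest-neighbour matching, restricted to common support.
psmatch2 Group, outcome(GP_Times) pscore(myscore) neighbor(1) common

* _n1 is the _id of the matched observation and _pdif is the absolute
* difference in propensity score between the pair.
sort _id
list _id _n1 _pscore _pdif Age Gender Living if _treated == 1 & _support == 1, noobs

* Covariate balance between the groups after matching.
pstest Age Gender Living
```

Large values of _pdif, or the same _n1 appearing again and again, are the kind of thing the "sniff test" is meant to catch.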

I hope this helps

Date: Tue, 18 Feb 2014 17:04:03 +0000
From: <>
Subject: st: propensity score (diff group sizes: treatment >>> control)

Dear list,

I am evaluating an intervention for which I have a control group (N=40) and a treatment group (N=500).
I am using propensity score matching to match the two groups by sociodemographics (age, gender, living status).
I am considering two methods and I would be delighted to receive advice from anybody who has encountered the same problem.

1/ First method
I calculated the pscore, and then performed the PSM (Kernel method).

pscore Group Age Gender Living, pscore(myscore) blockid(myblock)

psmatch2 Group, outcome(GP_Times) pscore(myscore) kernel(normal)

2/ Second method
Because of the large difference in the number of observations and the fact that my treatment group is the more numerous, I tried to invert the groups.
I created a new variable ('Group_opposite') in which the control group has N=500 and the treatment group N=40, then I calculated the new pscore, and finally I performed the PSM (Kernel method).

pscore Group Age Gender Living, pscore(myscore2) blockid(myblock2)

psmatch2 Group_opposite, outcome(GP_Times) pscore(myscore2) kernel(normal)

The values calculated using the second pscore seem to be more conservative than the first ones, and the results more acceptable.
Would it be right to use the second method instead of the first one?

Looking forward to your advice, many thanks in advance,

Best wishes,


Valentina Iemmi | Research Officer
London School of Economics and Political Science | Personal Social Services Research Unit - PSSRU Houghton Street | London WC2A 2AE 
