Re: st: Standardized difference of means after PS matching

From   Adam Olszewski
Subject   Re: st: Standardized difference of means after PS matching
Date   Wed, 29 Aug 2012 17:25:50 -0400

Hi Ariel,
Thanks for the email. I'll try to clarify:

I run something like:
. psmatch2 treat age sex i.race ..., (I use radius matching BTW)
. xi: pbalchk treat age i.race ...., weight(_weight)

and get output of the kind:
1.race            SDM 0.21
2.race            SDM 0.18   (I'm making the numbers up but this is the range)

Then I try to run:
. psmatch2 treat age sex race ...,   (which treats "race" as a
continuous rather than a factor variable)
. xi: pbalchk treat age i.race,  weight(_weight)

and get:
1.race           SDM 0.05
2.race           SDM 0.06

The change between "i.race" and "race" in the psmatch2 model is the
only change I made (I think I initially did it by mistake actually),
and was surprised to see the difference.
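
One way to check whether the difference is an artifact of how the SDMs are computed is to calculate them by hand after each -psmatch2- run. The following is a minimal sketch, not -pbalchk-'s actual code: it assumes the usual definition of the standardized difference (difference in weighted means divided by the square root of the average of the two group variances), assumes the variables treat, race and _weight are in memory, and uses hypothetical dummy names race_1, race_2, ... created by -tabulate-. It is meant to be run from a do-file.

```stata
* Sketch: weighted standardized difference for each race dummy,
* computed by hand after psmatch2 has created _weight.
* Assumes treat is 0/1 and that no other variables are named race_*.
tabulate race, generate(race_)      // creates race_1, race_2, ...

foreach v of varlist race_* {
    * weighted mean and variance among the treated
    quietly summarize `v' [aweight = _weight] if treat == 1
    local m1 = r(mean)
    local v1 = r(Var)
    * weighted mean and variance among the controls
    quietly summarize `v' [aweight = _weight] if treat == 0
    local m0 = r(mean)
    local v0 = r(Var)
    * difference in means over the pooled standard deviation
    display "`v'" _col(20) "SDM " %6.3f (`m1' - `m0') / sqrt((`v1' + `v0') / 2)
}
```

Running this once after the "i.race" model and once after the "race" model (dropping the race_* dummies in between) gives two sets of numbers to compare against the -pbalchk- output. Note that -summarize- reports the sample variance, which for a dummy differs slightly from p(1-p), so small discrepancies are to be expected.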
I tend not to use -pstest- or other tests that rely on sample-size
dependent p-values, following P.C. Austin's recommendation. Though perhaps
I should check whether -pstest- would report any change in bias.
On Wed, Aug 29, 2012 at 5:12 PM, Ariel Linden, DrPH wrote:
> Hi Adam,
> It is not clear to me which step in your process is giving you different
> results. In which program are you using c.race vs i.race? I am not sure that
> -pbalchk- (user-written program by Mark Lunt) accepts the prefix -c.- and,
> furthermore, I don't understand why you'd treat a multi-category
> variable (such as race) as continuous to begin with. You'd certainly end up
> with a result that would be meaningless.
> As far as calculating balance on a binary variable (assuming it is binary),
> your results should not differ much (between treating the variable as
> continuous and eliciting a proportion, or treating it as a count and using
> chi2) if you have sufficient sample sizes (see what happens when you compare
> chi2 with t-test for proportions)... However, if you have a multiple
> categorical variable, then I believe you'd need to create dummies for use in
> -pbalchk-
> In any case, I can't really provide more guidance, since I am not sure exactly
> what is going on given the limited information you provided.
> Ariel
> Date: Wed, 29 Aug 2012 00:15:15 -0400
> From: Adam Olszewski
> Subject: st: Standardized difference of means after PS matching
> Hello,
> I noted something surprising today and I was wondering if any
> stata-listers have some insight into this issue.
> I have been using the (SSC derived user program) -psmatch2- for
> propensity score matching and (also SSC-derived) -pbalchk- for
> evaluation of balance after matching using standardized differences of
> means (SDM).
> I noted that the balance (sometimes quite dramatically!) improves if I
> replace in the PS logistic model categorical variables with a 'plain'
> non-factorized version (e.g. use "c.race" rather than "i.race" as a
> variable). This of course makes no sense as a rational "real-world"
> use of a variable, however the goal of PS analysis is to achieve the
> balance and not to make a sensible predictive model.
> I wonder however if this improvement could be an artifact of how the
> SDM's are calculated by the -pbalchk- command (of course I calculate
> categorized proportion differences using the -xi- syntax, i.e.
> "i.race"). I would not like to exploit a mathematical quirk to have
> "better" (?) results, but on the other hand I can't find anything
> particularly conceptually wrong with it.
> Would any psmatch2/pbalchk users disagree?
> Best regards,
> Adam Olszewski