Hi Ariel,

Thanks for the email. I'll try to clarify.

I run something like:

. psmatch2 treat age sex i.race ..., (I use radius matching BTW)
. xi: pbalchk treat age i.sex i.race ...., weight(_weight)

and get output of the kind:

1.race SDM 0.21
2.race SDM 0.18

(I'm making the numbers up, but this is the range.)

Then I try to run:

. psmatch2 treat age sex race ..., (which treats "race" as a continuous rather than a factor variable)
. xi: pbalchk treat age i.sex i.race, weight(_weight)

and get:

1.race SDM 0.05
2.race SDM 0.06

The change between "i.race" and "race" in the psmatch2 model is the only change I made (I think I initially did it by mistake, actually), and I was surprised to see the difference.

I tend not to use -pstest- or other tests that rely on sample-size-dependent p-values, following P.C. Austin's recommendation. Though perhaps I should check whether -pstest- would report any change in bias reduction.

On Wed, Aug 29, 2012 at 5:12 PM, Ariel Linden. DrPH <ariel.linden@gmail.com> wrote:
> Hi Adam,
>
> It is not clear to me what step in your process is giving you different
> results? In which program are you using c.race vs i.race? I am not sure that
> -pbalchk- (user written program by Mark Lunt) accepts the prefix -c.- and
> furthermore, I don't understand why you'd treat a multiple categorical
> variable (such as race) as continuous to begin with? You'd certainly end up
> with a result that would be meaningless.
>
> As far as calculating balance on a binary variable (assuming it is binary),
> your results should not differ much (between treating the variable as
> continuous and eliciting a proportion, or treating it as a count and using
> chi2) if you have sufficient sample sizes (see what happens when you compare
> chi2 with t-test for proportions)...
> However, if you have a multiple
> categorical variable, then I believe you'd need to create dummies for use in
> -pbalchk-
>
> In any case, I can't really provide more guidance, since I am not sure exactly
> what is going on given the limited information you provided.
>
> Ariel
>
> Date: Wed, 29 Aug 2012 00:15:15 -0400
> From: Adam Olszewski <adam.olszewski@gmail.com>
> Subject: st: Standardized difference of means after PS matching
>
> Hello,
> I noted something surprising today and I was wondering if any
> stata-listers have some insight into this issue.
> I have been using the (SSC derived user program) -psmatch2- for
> propensity score matching and (also SSC-derived) -pbalchk- for
> evaluation of balance after matching using standardized differences of
> means (SDM).
> I noted that the balance (sometimes quite dramatically!) improves if I
> replace in the PS logistic model categorical variables with a 'plain'
> non-factorized version (e.g. use "c.race" rather than "i.race" as a
> variable). This of course makes no sense as a rational "real-world"
> use of a variable, however the goal of PS analysis is to achieve the
> balance and not to make a sensible predictive model.
> I wonder however if this improvement could be an artifact of how the
> SDM's are calculated by the -pbalchk- command (of course I calculate
> categorized proportion differences using the -xi- syntax, ie.
> "i.race"). I would not like to exploit a mathematical quirk to have
> "better" (?) results, but on the other hand I can't find anything
> particularly conceptually wrong with it.
> Would any psmatch2/pbalchk users disagree?
> Best regards,
> Adam Olszewski
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
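
For anyone wanting to check the per-level SDMs by hand, independent of how -pbalchk- computes them, here is a minimal sketch in Python. It assumes the usual standardized difference for a binary covariate, (p1 - p0) / sqrt((p1(1-p1) + p0(1-p0)) / 2), applied to one dummy per race level and weighted by the matching weights. Whether -pbalchk- uses exactly this formula is an assumption, and the function names (weighted_proportion, sdm_binary, sdm_by_level) are hypothetical.

```python
from math import sqrt


def weighted_proportion(indicator, weights):
    """Weighted mean of a 0/1 indicator."""
    return sum(x * w for x, w in zip(indicator, weights)) / sum(weights)


def sdm_binary(indicator, treat, weights):
    """Standardized difference for a 0/1 covariate between treated (1)
    and control (0) units, using matching weights.

    Assumed formula (not confirmed to be what -pbalchk- does):
        (p1 - p0) / sqrt((p1*(1-p1) + p0*(1-p0)) / 2)
    """
    x1 = [x for x, t in zip(indicator, treat) if t == 1]
    w1 = [w for w, t in zip(weights, treat) if t == 1]
    x0 = [x for x, t in zip(indicator, treat) if t == 0]
    w0 = [w for w, t in zip(weights, treat) if t == 0]
    p1 = weighted_proportion(x1, w1)
    p0 = weighted_proportion(x0, w0)
    pooled_sd = sqrt((p1 * (1 - p1) + p0 * (1 - p0)) / 2)
    # If the level is absent (or universal) in both groups, report 0.
    return (p1 - p0) / pooled_sd if pooled_sd > 0 else 0.0


def sdm_by_level(race, treat, weights):
    """One SDM per category level, i.e. one per dummy variable."""
    return {
        level: sdm_binary([1 if r == level else 0 for r in race], treat, weights)
        for level in sorted(set(race))
    }
```

Note that the balance check here depends only on the dummies and the weights, so any change in the SDMs between the two runs would have to come through the weights produced by the propensity score model, not through the balance formula itself.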

