

From: "Ariel Linden, DrPH" <ariel.linden@gmail.com>
To: <statalist@hsphsun2.harvard.edu>
Subject: re: Re: st: multiple imputation and propensity score
Date: Thu, 25 Aug 2011 13:40:43 -0400

I have a completely different take on this problem from what has been discussed so far in the thread. That doesn't mean I think the earlier suggestions are wrong, but I would tackle the problem differently.

My primary concern is hearing that Stefano generates multiple propensity scores for the same individual and that some of them "makes no sense, being virtually the same in patients treated with angioplasty or by-pass." This is problematic for two reasons. First, the propensity score is not intended to differentiate between treatment and control units, but rather to find a common basis between them (i.e., on average, they should have similar baseline characteristics, the only difference being that some got treatment and some didn't). Second, I am not sure I agree with generating multiple propensity scores and then choosing which ones will represent the match. Under this scenario it is entirely possible to generate completely different matches, with characteristics that differ greatly across matched groups.

Those are my basic concerns. Now to solving the problem. One approach to dealing with missing values is to add a related variable describing each variable's "missingness" and then use it in the propensity score estimation. For example, if we have a variable "gender" with some values missing, we'd generate another variable called "gender_miss", with a value of 1 if gender is missing and 0 if not. I can provide references where this approach is used. This solution could be problematic if there are too many missing values across many different variables, but that is perhaps beyond the scope of this discussion.

I hope this helps

Ariel

Date: Wed, 24 Aug 2011 12:48:47 -0500
From: Stas Kolenikov <skolenik@gmail.com>
Subject: Re: st: multiple imputation and propensity score

On Wed, Aug 24, 2011 at 11:39 AM, Stefano Di Bartolomeo wrote:
> In truth I am trying to be humble and apply the best methodology I can.
> I got tricked into this problem in 2 simple steps. First I read 'A Guide to Imputing Missing Data with Stata' by Mark Lunt, which is a step-by-step guide for non-pundits like me. Throughout the guide a propensity score is the main goal of the examples. So I got the feeling that multiple imputation is good for propensity scores and did that. Then I reviewed the recent literature on propensity scores, and it seems that matching is the technique that most reduces bias compared with stratification on quintiles or inclusion of the PS as a covariate. And again, I tried to follow the suggestion. Now I understand I have to give up one of the two techniques.

I believe you could still see your approach through with both MI and PS. For that, you would need to:

1. Create multiple imputations using -ice- or the official -mi-.
2. Write your own estimation program (say you named it -mi_ps_st-) that would
   2a. run logistic regression as the propensity score model,
   2b. generate propensity scores,
   2c. run your survival model.
   2d. Ideally, you'd want to correct the standard errors in the survival model for the fact that you have created some of the regressors. It is possible to do that in the linear regression context (see Hardin (2002), http://stata-journal.com/article.html?article=st0018), but I don't know if this approach is generalizable to -streg-.
3. Run your -mi_ps_st- prefixed by -mim- (or, respectively, -mi estimate-) to combine the estimates and standard errors.

Remember that MI only makes sense when you have the final parameter estimates and their standard errors. The intermediate results, like specific imputations, or observation-level averages across them, as you thought initially for your propensity scores, may not be very meaningful.

The guide you referred to is dated, in the sense that Stata 12 incorporates MICE methodology in the official -mi-. The guide would still be applicable to Stata 11.
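As a rough, untested illustration of steps 1 to 3, the sketch below uses the official -mi- on simulated data. Everything not named in the message is an assumption: the variable names (age, diabetes, treat, time, died), the use of -mi impute logit- for a single incomplete covariate, the Weibull -streg- with the propensity score entered as a covariate, and the `cmdok` option used so that -mi estimate- accepts a user-written command. Step 2d, the standard-error correction, is not attempted.

```stata
* Simulated data; every variable name here is hypothetical
clear
set seed 2011
set obs 400
generate double age = rnormal(65, 8)
generate byte diabetes = runiform() < 0.3
generate byte treat = runiform() < invlogit(0.03*(age - 65) + 0.5*diabetes)
generate double time = -ln(runiform()) / (0.1*exp(0.2*treat + 0.3*diabetes))
generate byte died = runiform() < 0.8
replace diabetes = . if runiform() < 0.15

* Step 1: create multiple imputations with the official -mi-
mi set wide
mi register imputed diabetes
mi impute logit diabetes age treat died, add(5) rseed(2011)
mi stset time, failure(died)

* Step 2: wrapper that is re-run on each imputed data set
capture program drop mi_ps_st
program define mi_ps_st, eclass
    capture drop pscore
    * 2a. propensity score model
    logit treat age diabetes
    * 2b. generate propensity scores
    predict double pscore, pr
    * 2c. survival model; its e() results are what get combined
    streg treat pscore, distribution(weibull)
end

* Step 3: combine estimates and standard errors across imputations
mi estimate, cmdok: mi_ps_st
```

The wrapper re-estimates the propensity score within each imputed data set, so only the final coefficients and standard errors are combined, never the scores themselves.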
I also did not like the guide's reliance on the author's own programs, although that is sometimes inevitable (I tend to trust the stuff that underwent some minimal checks at SJ or SSC a little bit better).

BTW, I don't think it is at all possible to get the right standard errors from matching, so you would probably have to let that methodology go anyway. So you would have to look into other options with your survival model.
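For Ariel's missing-indicator suggestion at the top of this message, a minimal Stata sketch on simulated data might look like the following. The variables treat and age, the simulated values, and the step that fills missing gender with a constant (so those patients are not dropped from the logit) are assumptions made for illustration, not part of the original suggestion.

```stata
* Simulated data; all variable names other than gender and gender_miss
* are hypothetical
clear
set seed 12345
set obs 500
generate double age = rnormal(60, 10)
generate byte gender = runiform() < 0.5
generate byte treat = runiform() < invlogit(-0.5 + 0.02*(age - 60) + 0.3*gender)
replace gender = . if runiform() < 0.10

* Missingness indicator: 1 if gender is missing, 0 if not
generate byte gender_miss = missing(gender)

* Assumption: give missing values a constant so those patients stay in the model
generate byte gender_fill = cond(missing(gender), 0, gender)

* Propensity score model that includes the missingness indicator
logit treat age gender_fill gender_miss
predict double pscore, pr
```

With the constant fill, the coefficient on gender_miss absorbs the difference between patients with missing and observed gender, which is why the indicator has to enter the model alongside the filled variable.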
