# Re: st: pweights, propensity scores

From: Paul
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: pweights, propensity scores
Date: Fri, 18 Jan 2013 13:02:45 -0500

```
Thanks for your response, Ariel. You're quite right, that article
focuses on a doubly robust estimator, and I'm quite familiar with that
concept.  However, I was referring to the IPTW estimator in that
article in section 2.2, which is not the doubly robust estimator.

I've solved the first part of my problem, and, in case anyone else
ever has a similar issue, I'll outline my partial solution below.

A simple regression of y on a dummy for treatment gives

beta_hat = (1/N_1) sum T*y - (1/N_2) sum (1-T)*y

where N_1 is the number treated, N_2 is the number untreated, and T is
the treatment dummy, whereas what I want is

beta_iptw = (1/N) sum T*y/p(x) - (1/N) sum (1-T)*y/(1-p(x))

A simple but crude solution, then, is to multiply y by N_1/(N*p(x)) for
treated observations and by N_2/(N*(1-p(x))) for untreated ones, and
then run the simple regression on the rescaled y. Now I have to figure
out how to get standard errors!

On Fri, Jan 18, 2013 at 10:53 AM, Ariel Linden, DrPH
<ariel.linden@gmail.com> wrote:
> Hi Paul,
>
> The Stata article you are referring to discusses the "doubly robust" approach to propensity-score weighted regression (http://www.stata-journal.com/article.html?article=st0149)
>
> At the most general level, this approach includes the same covariates in both the propensity score estimation model and the outcome model, with the idea that the investigator has two chances to get the right answer, i.e., if the propensity score model is misspecified, the outcome model may still be correctly specified (and vice versa).
>
> In your first regression model below, you do not include covariates. Thus, you should not expect to get the exact same result as when you include the covariates which make the model "doubly robust".
>
> It is not clear to me why you are specifying additional weights. The original logic for the doubly robust approach (see Robins et al. 1995, later discussed by Lunceford and Davidian 2004) uses the IPTW weight to weight each outcome. From your code below, it seems to me that you are specifying a different weight for each outcome?
>
> Ariel
>
> References:
>
> Lunceford, J. K., and M. Davidian. 2004. Stratification and weighting via the propensity score in estimation of causal treatment effects: A
> comparative study. Statistics in Medicine 23: 2937-2960.
>
> Robins, J. M., A. Rotnitzky, and L. P. Zhao. 1995. Analysis of semiparametric regression models for repeated outcomes in the presence of missing
> data. Journal of the American Statistical Association 90: 106-121.
>
>
> Date: Thu, 17 Jan 2013 16:14:52 -0500
> From: Paul <paulburk314@gmail.com>
> Subject: st: pweights, propensity scores
>
> Hi all,
>
> I'm using propensity scores to estimate treatment effects, where
> treatment is exogenous conditional on the propensity score. I'm using
> an estimator from Wooldridge's 2010 textbook, which is also discussed
> in The Stata Journal (2008) 8, Number 3, pp. 334-353.
>
> Specifically, the treatment effect is estimated using (1/N) sum
> (T*Y/p) - (1/N) sum ((1-T)*Y/(1-p)).
>
> According to the Stata Journal article, this can be estimated using a
> regression with pweights equal to the "inverse of the treatment
> probability defined using the
> propensity score." However, when I use just the sum of the weighted
> variables, I get a different answer from the regression result.  I'm
> not terribly familiar with pweights, so I could be making some dumb
> mistake.
>
> Below is my code.  Does anyone know what I'm doing wrong, or what the
> correct way to implement this method is?
>
> Thanks,
> Paul
>
> /* Regression using pweights */
> gen ipw=1/p_x if treated==1
> replace ipw=1/(1-p_x) if treated==0
>
> reg y treated [pweight=ipw]
>
> /* IPTW one variable */
> gen w1=((treated-p_x)/(p_x*(1-p_x)))
> gen w1_y=w1 *y
>
> sum w1_y
>
> /* IPTW two variables */
> gen w2a_y=y*treated/p_x
> gen w2b_y=y*(1-treated)/(1-p_x)
>
> foreach type in a b{
>   sum w2`type'_y
>   local mean_w2`type' =r(mean)
> }
>
> di `mean_w2a'-`mean_w2b'
>
> /* IPTW two variables weights sum to one */
> bysort treated: egen w_ipw=total(ipw)
> gen w3a_y=(1/w_ipw)*y*treated/p_x
> gen w3b_y=(1/w_ipw)*y*(1-treated)/(1-p_x)
>
> foreach type in a b{
>   sum w3`type'_y
>   local mean_w3`type' =r(mean)
> }
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
```
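The rescaling described in the reply can be checked numerically. The following is a minimal Stata sketch on simulated data, not code from the thread: the data-generating process, the seed, and the names `x`, `d`, `y_adj`, `n_all`, `n_t`, `n_c`, and `b_iptw` are assumptions made for illustration, and the propensity score is taken as known rather than estimated. The variable names `y`, `treated`, `p_x`, and `ipw` follow the original post.

```stata
* Illustrative sketch only: simulated data, propensity score assumed known
clear
set seed 12345
set obs 1000
gen double x       = rnormal()
gen double p_x     = invlogit(0.5*x)
gen byte   treated = runiform() < p_x
gen double y       = 1 + 2*treated + x + rnormal()

* Target: (1/N) sum T*y/p(x) - (1/N) sum (1-T)*y/(1-p(x))
gen double d = treated*y/p_x - (1-treated)*y/(1-p_x)
quietly summarize d
scalar b_iptw = r(mean)

* Group sizes: N, N_1 (treated), N_2 (untreated)
quietly count
scalar n_all = r(N)
quietly count if treated == 1
scalar n_t = r(N)
scalar n_c = scalar(n_all) - scalar(n_t)

* Rescale y so that the unweighted difference in group means equals the target
gen double y_adj = y*scalar(n_t)/(scalar(n_all)*p_x) if treated == 1
replace y_adj = y*scalar(n_c)/(scalar(n_all)*(1-p_x)) if treated == 0
quietly regress y_adj treated
display "IPTW by hand:        " scalar(b_iptw)
display "rescaled regression: " _b[treated]

* pweight regression: each weighted group mean is divided by the sum of
* that group's weights rather than by N
gen double ipw = cond(treated == 1, 1/p_x, 1/(1-p_x))
quietly regress y treated [pweight = ipw]
display "pweight regression:  " _b[treated]
```

The first two displayed numbers should agree up to rounding, because regressing on a single dummy returns the difference in group means. The third generally differs: a weighted regression on a dummy returns sum(w*T*y)/sum(w*T) - sum(w*(1-T)*y)/sum(w*(1-T)), so the denominators are the group weight totals, not N. The sketch does not address standard errors.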