Statalist



SV: st: Imbalance in control versus treated group, and weights


From   <Alexander.Severinsen@telenor.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   SV: st: Imbalance in control versus treated group, and weights
Date   Mon, 13 Oct 2008 08:44:02 +0200

Stas, thanks for this. I'll have a go at your idea.

Best wishes,
Alexander



-----Original message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Stas Kolenikov
Sent: 10 October 2008 00:15
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Imbalance in control versus treated group, and weights

I am a design-based inference guy, I know too much of survey statistics and too little of anything else :)). So here are my two design-based cents.

If you had, say, 5000 people with z=1, all of them sampled, and 3000 sampled out of the remaining 5000 people with z=0, I would just treat those as strata with differential probabilities of selection:
Pr[selection|z=1]=1, Pr[selection|z=0]=3/5, so the pweight to go along with the first group is 1, while the pweight to go along with the second group is 5/3=1.667. That should actually be about the same reweighting idea that Austin suggested originally.
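
As a rough check of the arithmetic above, here is a minimal Python sketch (Python is used only for illustration; it is not from the original post). The dictionaries `population` and `sampled` and the helper `pweights` are hypothetical names, and the counts are the illustrative figures from the paragraph above, not real data. Each weight is the inverse of the stratum's selection probability, so the weighted sample counts should add back up to the stratum sizes.

```python
from fractions import Fraction

# Illustrative stratum counts from the example above (not real data).
population = {1: 5000, 0: 5000}   # people in each z stratum
sampled = {1: 5000, 0: 3000}      # people actually sampled in each stratum


def pweights(population, sampled):
    """Return the inverse-probability-of-selection weight for each stratum."""
    return {z: Fraction(population[z], sampled[z]) for z in population}


weights = pweights(population, sampled)
for z, w in sorted(weights.items()):
    # 1 / w is the selection probability for the stratum.
    print(f"z={z}: Pr[selection]={1 / w}, pweight={float(w):.3f}")

# Weighted sample counts reproduce the stratum population sizes.
weighted_totals = {z: weights[z] * sampled[z] for z in sampled}
print(weighted_totals)
```

In Stata itself the same numbers would be stored in a weight variable and supplied as pweights to the estimation command.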

There is literature on an area that seems related to your problem, population-based case-control studies, which takes the problem to the extreme: it is the dependent variable itself that is used as the criterion for sampling. Usually this applies to rare diseases, where all the cases are taken into the data set (Prob[selection]=1, weight=1) and controls are sampled from the population (Prob[selection] is a tiny number, weight = 1e5 or something like that). The interest is often in the probability of having the disease conditional on some covariates, and miraculously enough you can estimate this model using maximum likelihood without weights -- the only parameter that will be biased is the intercept. Alastair Scott from New Zealand is the guy who knows all about it; see http://www.citeulike.org/user/ctacmo/article/1036969.
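
The intercept-only bias can be illustrated with a small Python sketch. It rests on assumptions that are not in the original post: a logit link, a single binary covariate, and made-up coefficients `b0` and `b1`. The function name `sample_logit_params` and the sampling fractions `f_case` and `f_control` are likewise hypothetical. The sketch works with population proportions rather than fitting a model: it applies the two sampling fractions to cases and controls and reads off the log-odds that an unweighted logit would target in the resulting sample.

```python
import math


def logistic(t):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-t))


def sample_logit_params(b0, b1, f_case, f_control):
    """Intercept and slope of the log-odds in a case-control sample,
    for a binary covariate c and population model logit Pr[y=1|c] = b0 + b1*c."""
    log_odds = {}
    for c in (0, 1):
        p = logistic(b0 + b1 * c)              # population Pr[y=1 | c]
        kept_cases = f_case * p                # share of cases retained
        kept_controls = f_control * (1.0 - p)  # share of controls retained
        log_odds[c] = math.log(kept_cases / kept_controls)
    return log_odds[0], log_odds[1] - log_odds[0]


# Hypothetical values: a rare outcome, all cases kept, 1 in 1e5 controls kept.
b0, b1 = -6.0, 0.7
intercept, slope = sample_logit_params(b0, b1, f_case=1.0, f_control=1e-5)
print("slope in sample:    ", slope)
print("intercept in sample:", intercept)
print("b0 + log(f_case/f_control):", b0 + math.log(1.0 / 1e-5))
```

Under these assumptions the slope is unchanged and the intercept is shifted by the log of the ratio of the two sampling fractions, which is why only the intercept needs correcting.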

On 10/8/08, Alexander.Severinsen@telenor.com <Alexander.Severinsen@telenor.com> wrote:
> Thank you for the advice. Very helpful!
>
>  In this specific case z is a dummy, and if z=1 then this will increase the likelihood of observing x=1. And yes, I do observe outcomes for the group that was supposed to be treated but was not.
>
>  Best wishes,
>  Alexander
>
>  -----Original message-----
>  From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Austin Nichols
>  Sent: 8 October 2008 18:39
>  To: statalist@hsphsun2.harvard.edu
>  Subject: Re: st: Imbalance in control versus treated group, and weights
>
>
>  It is possible that some kind of propensity score reweighting or regression discontinuity design would be appropriate here, but without much more information, it is hard to offer any specific advice.  How does z affect x in the group supposed to have x=1?  Do you observe outcomes for the group supposed to have x=1 but having x=0? Etc.
>
>  Running a probit with the assumption E(y)=F(b0+b1*x+b2*z) seems unlikely to recover a good estimate of the effect of x on y unless that assumption is actually true!
>
>  On Wed, Oct 8, 2008 at 12:23 PM,  <Alexander.Severinsen@telenor.com> wrote:
>  > Dear Statalisters,
>  >
>  > I have the following problem. I have given a sample of 10000 people as targets for receiving an offer, and I have a control group of 5000 people. I know that the potentially treated group and the control group are representative. However, without my knowledge only 8000 of the 10000 targets were treated, and a specific criterion was used to pick those 8000 from the 10000.
>  >
>  > This has created an imbalance between my control group and those treated; the imbalance has been identified and concerns only one variable. I want to investigate whether the offer could reduce the defection rate of customers, but the variable that created this imbalance is known to hugely impact the defection rate. To reduce this problem I would like to use weights in Stata, but I am unsure how to approach this. Any tips would be greatly appreciated.
>  >
>  > Also, say that I did not correct for this, and did the following probit model with the following variables, y=defected/not defected, x=treated/control, z=factor that created imbalance:
>  >        y=b0+b1*x+b2*z
>  > would it be appropriate to say that the imbalance could be controlled for by including z as an independent variable in this fashion?
>  >
>  > Best wishes,
>  > Alexander Severinsen
>  *
>  *   For searches and help try:
>  *   http://www.stata.com/help.cgi?search
>  *   http://www.stata.com/support/statalist/faq
>  *   http://www.ats.ucla.edu/stat/stata/
>


--
Stas Kolenikov, also found at http://stas.kolenikov.name Small print: I use this email account for mailing lists only.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




© Copyright 1996–2014 StataCorp LP