Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Correcting for self selection

From: "Ariel Linden, DrPH" <[email protected]>
To: <[email protected]>
Subject: Re: st: Correcting for self selection
Date: Sat, 29 Jan 2011 11:01:32 -0800

```Using an offset is one reasonable approach. In essence, it turns the outcome
variable (number of wins or probability of winning, for Poisson or logistic
models respectively) into a rate (divided by the number of entries per
organization).

Another possible way to approach this is to match organizations on important
characteristics and then see which characteristics differentiate between
those organizations that win and those that don't. One of the important
variables to match on would be the "rate of submissions". In other words, we
can see how winners and losers are differentiated, after controlling for
rate of submission.

The organizations that compete more are obviously organized in a manner to
support this high rate. That is what you're probably interested in ferreting
out.

It is probably a good idea to test different approaches and see which is the
most robust and can stand up to critique.

I hope this helps

A

Date: Fri, 28 Jan 2011 08:20:52 +0000 (GMT)
From: Maarten buis <[email protected]>
Subject: Re: st: Correcting for self selection

--- On Fri, 28/1/11, [email protected] wrote:
> Based on an unbalanced panel data set of organizations
> competing with projects in a monthly competition, I try to
> model the likelihood of an organization winning (binary
> outcome) given explanatory variables. However, the
> organizations with the highest scores on the main
> explanatory variables (orgs a) participate more often in the
> contest than organizations with lower scores (orgs b).
>
> Since (orgs a) participate often, they enter many projects
> that lose and some projects that win. As a consequence, even
> though (orgs a) win the most in the contest overall, the
> models produce negative coefficients for these (orgs a)
> organizations.

Warning, this is just loose association on my part, so treat
this post as a suggestion on where you could look for a possible
solution.

What you want to do looks to me a lot like what people do when
they enter an offset to their -logit- or -poisson- model: some
units are longer or more often exposed to the risk than others.
I have never been in a position where I had data where I needed
to use this, so I only know it from reading about it, hence the
warning above. I have the impression this trick is more commonly
used in medical/biological fields, so maybe someone from these
fields can shed some light on this.

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

```
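The offset idea discussed in this thread can be sketched in Stata roughly as follows. This is only a minimal illustration under stated assumptions, not the posters' own code: the variable names `org_id`, `win` (0/1 per entered project), and `score` (one explanatory variable) are hypothetical, and the data are collapsed to one row per organization, which discards the monthly panel structure. With the number of entries as exposure, the model is log E[wins] = log(entries) + b0 + b1*score, so the coefficients describe the win rate per entry, not the raw number of wins.

```stata
* Hypothetical variable names (not from the original posts):
*   org_id = organization identifier
*   win    = 1 if the entered project won, 0 otherwise
*   score  = an explanatory variable of interest

* Collapse entry-level data to one row per organization:
* total wins, number of entries, and the mean of the explanatory variable
collapse (sum) wins=win (count) entries=win (mean) score, by(org_id)

* Poisson model of wins with the number of entries as exposure;
* Stata adds ln(entries) to the linear predictor with coefficient fixed at 1
poisson wins score, exposure(entries) irr

* Equivalent specification with an explicit offset variable
generate double ln_entries = ln(entries)
poisson wins score, offset(ln_entries) irr
```

With `irr`, the reported incidence-rate ratio for `score` is the multiplicative change in wins per entry for a one-unit increase in `score`. Whether collapsing to the organization level is appropriate depends on the research question. Keeping organization-month rows and clustering standard errors by organization would be one alternative worth comparing.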