The Stata listserver

st: Hierarchical Selection Model

From   Christopher R Berry
Subject   st: Hierarchical Selection Model
Date   Mon, 23 Jun 2003 12:28:00 -0400

I am using Stata 8.

I would like to know if anyone has advice on how to approach the following
model.  I'm looking at school board elections, where I have data on all school
board elections in one state.  I would like to test whether certain variables
improve a challenger's share of the vote relative to an incumbent.  There are
two stages of selection involved.  First, the incumbent decides whether or not
to run for re-election.  Second, if the incumbent runs, she may or may not face
a challenger.  For races in which the incumbent runs *and* faces a challenger,
we observe the challenger's vote share and the incumbent's share.  For what it's
worth, the incumbent's vote share is 100% whenever she runs and there is no
challenger; the challenger's vote share is 100% whenever there is no incumbent.
But the interesting case is when there is both an incumbent and a challenger.
I'd like to model the challenger's vote share as a function of other attributes
of the district.  My question is how to estimate the selection model in Stata.

An additional wrinkle is that the vote share data are at the precinct level,
whereas the decision of the candidates to run is made at the district level.  In
other words, there are J districts with Pj precincts in each district (Pj not
equal for all J).  When a candidate runs, she runs in every precinct in the
district.  So the two stages of selection occur at the district level (i.e., are
identical for every precinct within a district), but the vote share varies by
precinct.

So I could first estimate a probit model for whether or not the incumbent runs,
then generate the inverse Mills ratio (IMR) and include that in a second probit
model estimating whether or not the race is contested (restricting estimation to
cases where there is an incumbent running).  Both of these equations would be
estimated at the district level.  I would then take the IMR from the second
equation and include it in the final vote share equation, which would be
estimated at the precinct level (the IMR, which would have been estimated at the
district level in step 2, would then be identical across all precincts in a
district in the vote share model).  Estimation of the final equation would be
restricted to cases where both an incumbent and a challenger are running.  But
how should I calculate standard errors appropriately in all the equations?  If
anyone has ideas about how to do this better, I'd be much obliged.
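
For concreteness, the three steps described above might be sketched roughly as
follows.  This is only an illustration of the procedure as stated, not a
recommended estimator.  All variable names (district, inc_runs, contested,
chal_share, z1, z2, w1, w2, x1, x2) are hypothetical placeholders, and the data
are assumed to be stored with one row per precinct.

```stata
* Hypothetical variable names; data are assumed to be one row per precinct.
*   district    district identifier
*   inc_runs    1 if the incumbent runs (constant within district)
*   contested   1 if a challenger enters (constant within district)
*   chal_share  challenger's vote share in the precinct
*   z1 z2, w1 w2, x1 x2   placeholder covariates for each equation

* Flag one row per district so the district-level probits are not
* implicitly weighted by the number of precincts in each district.
egen byte dtag = tag(district)

* Step 1: does the incumbent run?  (district level)
probit inc_runs z1 z2 if dtag
predict double xb1, xb
* Note: in Stata 8 these functions are named normden() and norm().
generate double imr1 = normalden(xb1) / normal(xb1)

* Step 2: is the race contested, given that the incumbent runs?
probit contested w1 w2 imr1 if dtag & inc_runs == 1
predict double xb2, xb
generate double imr2 = normalden(xb2) / normal(xb2)

* Step 3: precinct-level vote share, incumbent and challenger both running.
* imr2 is constant across precincts within a district.
regress chal_share x1 x2 imr2 if inc_runs == 1 & contested == 1, cluster(district)
```

The cluster(district) option allows for correlation among precincts in the same
district, but it does not account for imr1 and imr2 being estimated rather than
observed, so the reported standard errors are probably still not right.  One
possibility would be to bootstrap the whole three-step sequence, resampling
entire districts.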

