Hi
Thanks so much for your input, both Mark and Stas. I wondered whether I'd have to go back to the survey design to do this, using some variation on the -svy- commands. We designed the survey ourselves: three villages were purposively sampled. In two of the villages, random samples of households were taken, and one adult male and one adult female were surveyed in each household where possible, although for a variety of reasons this could not always be done. For the third village we aimed for full enumeration, though we didn't quite achieve it, and again one male and one female were surveyed within each household where possible. So the weights at that level are straightforward, and I did this with earlier analyses.
I'll put some work in and see what happens.
Jennifer
Jennifer Leavy
Research Officer
Vulnerability and Poverty Reduction Team
Tel: ++ 44 (0)1273 678747
Fax: ++ 44 (0)1273 621202
Information on the Rural Labour Markets project is available from our website:
http://www.ids.ac.uk/ids/pvty/pvrurallabour.html
From "Stas Kolenikov" <skolenik@gmail.com>
To statalist@hsphsun2.harvard.edu
Subject Re: st: RE: RE: how to cluster with ivprobit with two-step option?
Date Mon, 21 Aug 2006 12:25:47 -0500
_____
If the equation of interest is the outcome equation of the selection
model, it isn't clear that you need to estimate the selection equation
explicitly, i.e., 1a and 1b. In other words, you're talking in
terms of estimating a system of equations, but you may only need to
worry about just the outcome equation. If that's the case, then, for
example, the excluded instruments you were thinking of using in (1a) and
(1b) to instrument for the endogenous regressors in the selection
equation could be used directly as the exclusion restrictions in the
Heckman-type estimation in (2b). Or, put another way, the probit first
stage in the selection estimation can be a reduced form estimation with
just the exogenous regressors.
That gives you proper identification, and reduces the number of
equations in the system from 4 to 3, but does not solve Jennifer's
problem of incorporating the complex sample structure into the
estimation procedure. If you can write this down as a system, with
scores/estimating equations/moment conditions implied by it, then the
problem of design-based estimation can be solved through
linearization/sandwich estimator of variance, but I don't think this
has ever been programmed, nor that it would be easy to program in the
first place. What Jennifer might think of instead is to use resampling
methods of variance estimation with -svy brr- and/or -svy jackknife-,
which however requires a lot of tuning of the weights and such.
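To make the resampling idea concrete, here is a minimal sketch in Python (not Stata) of a delete-one-cluster jackknife for a single stratum. It does not reproduce what -svy jackknife- does internally. The function names, the (cluster, weight, y) data layout and the numbers are all hypothetical, and a weighted mean stands in for the estimator; in Jennifer's case the estimator would be the entire multi-stage procedure, rerun on each replicate.

```python
from statistics import mean


def weighted_mean(rows):
    """Weighted mean of y over rows of (cluster, weight, y)."""
    total_weight = sum(w for _, w, _ in rows)
    return sum(w * y for _, w, y in rows) / total_weight


def delete_one_cluster_jackknife(rows, estimator):
    """Return (full-sample estimate, jackknife standard error).

    Assumes a single stratum: each replicate drops one cluster and
    rescales the remaining weights by n / (n - 1).
    """
    clusters = sorted({c for c, _, _ in rows})
    n = len(clusters)
    if n < 2:
        raise ValueError("need at least two clusters")
    full_estimate = estimator(rows)
    replicates = []
    for dropped in clusters:
        kept = [(c, w * n / (n - 1), y) for c, w, y in rows if c != dropped]
        replicates.append(estimator(kept))
    center = mean(replicates)
    variance = (n - 1) / n * sum((r - center) ** 2 for r in replicates)
    return full_estimate, variance ** 0.5


# Hypothetical data: (household id used as the cluster, weight, outcome).
sample = [
    (1, 2.0, 10.0), (1, 2.0, 12.0),
    (2, 2.0, 7.0),
    (3, 1.0, 15.0), (3, 1.0, 11.0),
    (4, 1.0, 9.0),
]
estimate, se = delete_one_cluster_jackknife(sample, weighted_mean)
print(estimate, se)
```

Passing the estimator as a function is the design choice that matters here: any procedure that maps a reweighted sample to a number can be plugged in, which is exactly why resampling is attractive when no linearization variance has been programmed.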
With that said... the problem looks pretty hopeless at the moment.
What you can do as an ad-hoc plug-in rule is to run your program in
the most expanded form that allows for survey estimation, take design
effects from say -svy, deff: heckman- if that works at all (I've no
idea!) and use those DEFFs to inflate the standard errors and adjust the
tests in your final model, which will give proper point estimates but
will understate your standard errors due to the complex survey design. You
would need to make sure that the standard errors have been corrected
for the extra estimation in the IV-first stage of the Heckman-second
stage... and that may not be trivial by itself.
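As a rough illustration of that plug-in rule, the sketch below (Python, not Stata) uses the standard definition of the design effect as the ratio of the design-based variance to the variance that ignores the design, so a naive standard error is multiplied by sqrt(DEFF). The function name and the numbers are hypothetical, and the sketch assumes the DEFF borrowed from the simpler model carries over to the final one, which is the ad-hoc part.

```python
import math


def deff_adjusted(estimate, naive_se, deff):
    """Inflate a naive standard error by sqrt(DEFF) and recompute z.

    deff is variance(design-based) / variance(ignoring the design),
    taken here from a simpler model that does allow survey estimation.
    """
    if deff <= 0:
        raise ValueError("design effect must be positive")
    adjusted_se = naive_se * math.sqrt(deff)
    z = estimate / adjusted_se
    return adjusted_se, z


# Hypothetical coefficient, naive standard error and design effect.
adjusted_se, z = deff_adjusted(estimate=0.30, naive_se=0.10, deff=4.0)
print(adjusted_se, z)
```

With a design effect of 4 the standard error doubles and the z statistic is halved, so a coefficient that looked clearly significant under the naive standard error may no longer be.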
Finally, note that the clustering due to complex survey design may
need to be taken at an earlier stage of sampling than households. Your
data provider should have included the description of the sample
design, and you need to start correcting for clustering at
the level of the primary sampling units (PSUs), which may be something
like county or a postal code or something like that, with households
nested within those PSUs.
HTH.
--
Stas Kolenikov