Statalist The Stata Listserver



st: RE: how to cluster with ivprobit with two-step option?


From   "Jennifer Leavy" <J.Leavy@ids.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: how to cluster with ivprobit with two-step option?
Date   Tue, 22 Aug 2006 16:22:25 +0100

Hi

Thanks so much for your input, both Mark and Stas. I wondered if I'd have to go back to the survey design to do this, using some variation on the -svy- commands. We designed the survey ourselves: three villages were purposively sampled. In two of the villages, random samples of households were taken, and where possible one adult male and one adult female were surveyed in each household, although for a variety of reasons this could not always be done. For the third village we aimed for full enumeration, though we didn't quite achieve this, and again where possible one male and one female were surveyed within each household. So the weights at that level are straightforward, and I did this with earlier analyses.

I'll put some work in and see what happens.

Jennifer

Jennifer Leavy

Research Officer

Vulnerability and Poverty Reduction Team

Tel: ++ 44 (0)1273 678747

Fax: ++ 44 (0)1273 621202

Information on the Rural Labour Markets project is available from our website:

http://www.ids.ac.uk/ids/pvty/pvrurallabour.html

From	   "Stas Kolenikov" <skolenik@gmail.com>	 
To	   statalist@hsphsun2.harvard.edu	 
Subject	   Re: st: RE: RE: how to cluster with ivprobit with two-step option?	 
Date	   Mon, 21 Aug 2006 12:25:47 -0500	


	If the equation of interest is the outcome equation of the selection
	model, it isn't clear that you need to estimate the selection
	equation explicitly, i.e., 1a and 1b.  In other words, you're talking in
	terms of estimating a system of equations, but you may only need to
	worry about just the outcome equation.  If that's the case, then, for
	example, the excluded instruments you were thinking of using in (1a) and
	(1b) to instrument for the endogenous regressors in the selection
	equation could be used directly as the exclusion restrictions in the
	Heckman-type estimation in (2b).  Or, put another way, the probit first
	stage in the selection estimation can be a reduced form estimation with
	just the exogenous regressors.
	

That gives you proper identification, and reduces the number of
equations in the system from 4 to 3, but does not solve Jennifer's
problem of incorporating the complex sample structure into the
estimation procedure. If you can write this down as a system, with
scores/estimating equations/moment conditions implied by it, then the
problem of design-based estimation can be solved through
linearization/sandwich estimator of variance, but I don't think this
has ever been programmed, or that it would be easy to program in the
first place. What Jennifer might think of instead is to use resampling
methods of variance estimation with -svy brr- and/or -svy jackknife-,
which however requires a lot of tuning of the weights and such.
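
As a rough illustration of the resampling route, here is a minimal
sketch. It assumes Stata's example dataset womenwk and treats county as
the PSU purely for illustration; neither is the survey discussed in this
thread, and the model shown leaves out the IV step entirely. -svy brr-
is not shown because it additionally needs either two PSUs per stratum
or supplied replicate weights.

```stata
* Sketch only: womenwk is an example dataset, not the survey in this thread.
* Declaring county as the PSU is an assumption made for illustration.
webuse womenwk, clear
svyset county

* Delete-one-PSU jackknife variance for a Heckman selection model.
* wage is missing for non-workers, so select() needs no dependent variable.
svy jackknife: heckman wage educ age, select(married children educ age)
```

Each jackknife replicate refits the whole model, so a multi-step
procedure wrapped in a single program could in principle be resampled
the same way, which is the appeal of this approach.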

With that said... the problem looks pretty hopeless at the moment.
What you can do as an ad-hoc plug-in rule is to run your program in
the most expanded form that allows for survey estimation, take design
effects from, say, -svy, deff: heckman- if that works at all (I've no
idea!), and use those DEFFs to inflate the standard errors and tests in
your final model. That model will give proper point estimates, but on
its own it will understate the standard errors because it ignores the
complex survey design. You would also need to make sure that the
standard errors have been corrected for the extra estimation in the IV
first stage that feeds the Heckman second stage... and that may not be
trivial by itself.
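
The plug-in rule above could be sketched roughly as follows. This again
assumes the womenwk example dataset with county standing in for the PSU,
and the naive standard error of 0.05 is a made-up placeholder for
whatever the final model reports. The scaling used is
SE_adjusted = SE_naive * sqrt(DEFF).

```stata
* Sketch only: example data and a hypothetical PSU, not the real survey.
webuse womenwk, clear
svyset county

* Survey-aware Heckman fit, then design effects for each coefficient
svy: heckman wage educ age, select(married children educ age)
estat effects, deff

* estat effects leaves the design effects in the matrix r(deff)
matrix D = r(deff)

* Hypothetical naive standard error taken from the final model
scalar se_naive = 0.05

* Inflate it by the square root of the first coefficient's DEFF
scalar se_adj = se_naive * sqrt(D[1,1])
display "DEFF-adjusted standard error: " se_adj
```

This borrows the design effect from a simpler model, so it is only an
approximation for the final one.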

Finally, note that the clustering due to complex survey design may
need to be taken at an earlier stage of sampling than households. Your
data provider should have included the description of the sample
design, and you need to start correcting for clustering at
the level of the primary sampling units (PSUs), which may be something
like a county or a postal code, with households
nested within those PSUs.
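
Declaring such a nested design might look like the sketch below. The
data are simulated and the variable names psu, hhid and pw are
hypothetical; the real names and weights would come from the sample
design documentation.

```stata
* Simulated toy data; psu, hhid and pw are hypothetical names.
clear
set seed 12345
set obs 200
generate psu  = ceil(_n/20)     // 10 PSUs of 20 respondents each
generate hhid = ceil(_n/2)      // 2 respondents per household
generate pw   = 1 + runiform()  // made-up sampling weights

* PSUs at the first stage, households nested within them
svyset psu [pweight = pw] || hhid

* Summarize the declared design
svydescribe
```

As far as I understand it, without finite population corrections the
linearized variance is driven by the first-stage units, so getting the
PSU right matters more than listing the later stages.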

HTH.

--
Stas Kolenikov

