
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


RE: st: Endogeneity and Panel Data : treatreg, ivregress or .. ? Any suggestion would be really appreciated !

From   Cameron McIntosh <>
Subject   RE: st: Endogeneity and Panel Data : treatreg, ivregress or .. ? Any suggestion would be really appreciated !
Date   Mon, 14 Nov 2011 00:08:27 -0500

I strongly suggest that you take a look at the following (as for clustered observations and missing data, you haven't really described the dataset enough to be able to comment -- how many time points are there?):
Greene, W.H., & Hensher, D.A. (2010). Modeling Ordered Choices: A Primer. Cambridge, UK: Cambridge University Press.
Chesher, A., & Smolinski, K. (2012). IV models of ordered choice. Journal of Econometrics, 166(1), 33-48.
Carrasco, R. (2001). Binary Choice With Binary Endogenous Regressors in Panel Data. Journal of Business & Economic Statistics, 19(4), 385-394.
Chesher, A., Rosen, A., & Smolinski, K. (February 11, 2011).  An instrumental variable model of multiple discrete choice. cemmap working paper CWP06/11. The Institute for Fiscal Studies Department of Economics, UCL.
Mullahy, J. (2001). [Estimation of Limited Dependent Variable Models with Dummy Endogenous Regressors: Simple Strategies for Empirical Practice]: Comment. Journal of Business & Economic Statistics, 19(1), 23-25. 
Altonji, J.G., & Matzkin, R.L. (2005). Cross Section And Panel Data Estimators For Nonseparable Models With Endogenous Regressors.  Econometrica, 73(4), 1053-1101.
Dong, Y., & Lewbel, A. (April 2011). Simple Estimators for Binary Choice Models With Endogenous Regressors.
Lewbel, A. (March 2011). Binary Choice With Endogenous Or Mismeasured Regressors.

> From:
> Date: Sun, 13 Nov 2011 15:41:46 +0100
> Subject: st: Endogeneity and Panel Data : treatreg, ivregress or .. ? Any suggestion would be really appreciated !
> To:
> Dear Stata List,
> Dear Mark Schaffer (I guess ;-) )
> I have an econometric question related to endogenous variables and
> panel data, and I believe that it can be interesting for anyone who
> uses longitudinal data.
> Here's the context :
> I have a panel dataset of individuals who, at any time t, can
> endogenously choose the value of a variable E (for endogenous). E is
> not ordered and can take only a few values (in my case, 6 possible
> choices).
> I am particularly interested in the effect of one of these choices on
> a fully continuous outcome variable Y.
> That is, at any time and for any individual I would like to estimate
> Yit=a+bXit+cZit+eit
> where, for example, Z is a binary variable that equals 1 if
> individual i chooses E="the value of interest" at time t, and zero
> otherwise. Variables in X are assumed to be exogenous.
> I believe I have a good instrument for Z, along with other control
> demographic variables, and therefore I guess I have basically two
> choices in order to take into account the panel nature of my dataset
> 1) using ivregress2 with the option cluster(id) and correcting for the
> endogenous part with (Z= instrument + age + location of birth).
> However Z is a dummy variable... I know this should not be a problem
> but...
> 2) using treatreg with the option vce(bootstrap, cluster(id)
> reps(400)) and modeling the choice of E=2 (that is Z=1) with treat(Z=
> instrument + age + location of birth)
> 3) I tried to use xtivreg2 with fixed effects, but location of birth
> is time invariant (and I believe very important in order to understand
> Z) so it cannot be estimated.
> Is my approach correct? Do you perhaps know of other ways to tackle
> this multiple-choice endogeneity problem?
> Moreover, in the context of panel data, do I always need to use
> clustering on id in order to have correct standard errors ?
> My dataset is large, but I have much more time variation than
> clusters. About 200 000 individuals and 10 million observations for
> the whole dataset.
> The period where the instrument is available reduces the dataset
> considerably : 1 million observations and about 20 000 individuals.
> An important remark : the panel is NOT balanced. So individuals could
> come in and out of the dataset during the 10 year period covered by my
> dataset. Some have thus very few observations, and some have hundreds
> of rows.
> Many thanks in advance for your suggestions,
> Best,
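
As a rough illustration of option 1 in the question above (instrumenting the binary Z and clustering on id), here is a minimal simulation sketch in plain Python. Everything in it is an assumption made for illustration and is not taken from the poster's data: the function names `simulate_panel` and `estimate_iv`, the parameter values, a single binary instrument that is fixed within each individual, no X covariates, and a constant treatment effect. It compares the naive OLS slope with the just-identified IV slope, and a heteroskedasticity-robust standard error with a cluster-robust one. It omits the finite-sample corrections that statistical packages typically apply, so it is not a substitute for the Stata commands discussed in the thread.

```python
import random


def simulate_panel(n_individuals=10000, true_effect=2.0, seed=0):
    """Simulate an unbalanced panel. All parameter values are made up."""
    rng = random.Random(seed)
    rows = []  # each row is (id, y, z, w)
    for i in range(n_individuals):
        alpha = rng.gauss(0.0, 1.0)             # unobserved individual effect
        w = 1.0 if rng.random() < 0.5 else 0.0  # binary instrument, fixed per id
        n_periods = rng.randint(2, 20)          # unbalanced: 2 to 20 rows per id
        for _ in range(n_periods):
            # alpha drives both the choice Z and the outcome Y, so Z is endogenous
            z = 1.0 if 1.5 * w + alpha + rng.gauss(0.0, 1.0) > 0.75 else 0.0
            y = 1.0 + true_effect * z + alpha + rng.gauss(0.0, 1.0)
            rows.append((i, y, z, w))
    return rows


def estimate_iv(rows):
    """OLS and just-identified IV slopes of y on z, with two SEs for the IV slope."""
    n = len(rows)
    y_bar = sum(r[1] for r in rows) / n
    z_bar = sum(r[2] for r in rows) / n
    w_bar = sum(r[3] for r in rows) / n

    s_zy = sum((z - z_bar) * (y - y_bar) for _, y, z, _ in rows)
    s_zz = sum((z - z_bar) ** 2 for _, _, z, _ in rows)
    s_wy = sum((w - w_bar) * (y - y_bar) for _, y, _, w in rows)
    s_wz = sum((w - w_bar) * (z - z_bar) for _, _, z, w in rows)

    ols = s_zy / s_zz   # ignores endogeneity of z
    iv = s_wy / s_wz    # uses w as an instrument for z
    intercept = y_bar - iv * z_bar

    # Sandwich variance: the "meat" sums squared scores, either row by row
    # (robust) or after adding the scores up within each individual (cluster).
    robust_meat = 0.0
    cluster_scores = {}
    for g, y, z, w in rows:
        score = (w - w_bar) * (y - intercept - iv * z)
        robust_meat += score ** 2
        cluster_scores[g] = cluster_scores.get(g, 0.0) + score
    cluster_meat = sum(s ** 2 for s in cluster_scores.values())

    return {
        "n_obs": n,
        "ols": ols,
        "iv": iv,
        "se_robust": robust_meat ** 0.5 / abs(s_wz),
        "se_cluster": cluster_meat ** 0.5 / abs(s_wz),
    }


results = estimate_iv(simulate_panel())
for name, value in results.items():
    print(name, round(value, 4))
```

In this simulated setting the OLS slope should overstate the true effect of 2.0 because the individual effect raises both Z and Y, while the IV slope should land close to it. The cluster-robust standard error should come out larger than the unclustered one, because both the instrument and part of the error term are shared across all rows of the same individual. That is the usual reason for clustering on id in panel data, though how much it matters in a real dataset depends on how persistent the instrument and the errors actually are.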