Re: st: Endogeneity and Panel Data : treatreg, ivregress or .. ? Any suggestion would be really appreciated !

From   John Litfiba <>
Subject   Re: st: Endogeneity and Panel Data : treatreg, ivregress or .. ? Any suggestion would be really appreciated !
Date   Mon, 14 Nov 2011 12:02:12 +0100

Dear Cameron,

Thanks for your answer.
Let me clarify the following.

The average number of points per individual is about 70. There is no
doubt that choices of the endogenous variable within an individual
are correlated: some will always choose 2 particular values, some
will choose other values, etc.
The endogenous variable is not naturally ordered.

When I run treatreg as described before, without any correction for
clustering, the t-stats are great.
Of course, when I add the option vce(bootstrap, cluster(id) reps(400)),
all significance vanishes (maybe my instrument is not as good as
expected?)... but do I really need to correct for clustering here?
I mean, treatreg is not really designed for panel data, right?
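
For concreteness, the comparison described above can be sketched in Stata as follows. This is only a minimal sketch on simulated data, not the actual dataset: every variable name (y, x, z, instrument, age, birthplace, id) and every coefficient in the data-generating step is a made-up assumption. It fits treatreg with default and with cluster-robust standard errors, then ivregress 2sls with clustering, so that the standard errors on z can be compared. (In Stata 13 and later, treatreg is named etregress.)

```stata
* Hypothetical simulated unbalanced-free panel: 500 individuals, 20 rows each.
* All names and coefficients below are illustrative assumptions.
clear
set seed 2011
set obs 500
generate id = _n
generate birthplace = (runiform() < 0.5)   // time-invariant dummy
generate a_i = rnormal()                   // individual effect in the choice
generate c_i = rnormal()                   // individual effect in the outcome
expand 20
bysort id: generate t = _n
generate age = 30 + t
generate x = rnormal()                     // exogenous regressor
generate instrument = rnormal()            // excluded instrument for z
generate v = rnormal()                     // shock shared by choice and outcome

* Binary endogenous choice, correlated within individual through a_i
generate z = (0.8*instrument + 0.02*age + 0.5*birthplace + a_i + v - 1.5 > 0)

* Continuous outcome; errors are correlated within id (c_i) and with z (v)
generate y = 1 + 0.5*x + 1*z + c_i + 0.5*v + rnormal()

* 1) treatreg with default (non-clustered) standard errors
treatreg y x, treat(z = instrument age birthplace)
estimates store naive

* 2) same model with standard errors clustered on the individual
treatreg y x, treat(z = instrument age birthplace) vce(cluster id)
estimates store clustered

* Point estimates are identical; only the standard errors differ
estimates table naive clustered, b(%9.3f) se(%9.3f)

* 3) linear IV alternative, also clustered on the individual
ivregress 2sls y x (z = instrument age birthplace), vce(cluster id)
```

The bootstrap variant mentioned above, vce(bootstrap, cluster(id) reps(400)), can be substituted for vce(cluster id) in step 2 at the cost of run time.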


On 14 November 2011 06:08, Cameron McIntosh <> wrote:
> John,
> I strongly suggest that you take a look at the following (as for clustered observations and missing data, you haven't really described the dataset enough to be able to comment -- how many time points are there?):
> Greene, W.H., & Hensher, D.A. (2010). Modeling Ordered Choices: A Primer. Cambridge, UK: Cambridge University Press.
> Chesher, A., & Smolinski, K. (2012). IV models of ordered choice. Journal of Econometrics, 166(1), 33-48.
> Carrasco, R. (2001). Binary Choice With Binary Endogenous Regressors in Panel Data. Journal of Business and Economic Statistics, 19(4), 385-394.
> Chesher, A., Rosen, A., & Smolinski, K. (February 11, 2011). An instrumental variable model of multiple discrete choice. cemmap working paper CWP06/11. The Institute for Fiscal Studies Department of Economics, UCL.
> Mullahy, J. (2001). [Estimation of Limited Dependent Variable Models with Dummy Endogenous Regressors: Simple Strategies for Empirical Practice]: Comment. Journal of Business & Economic Statistics, 19(1), 23-25.
> Altonji, J.G., & Matzkin, R.L. (2005). Cross Section And Panel Data Estimators For Nonseparable Models With Endogenous Regressors. Econometrica, 73(4), 1053-1101.
> Dong, Y., & Lewbel, A. (April 2011). Simple Estimators for Binary Choice Models With Endogenous Regressors.
> Lewbel, A. (March 2011). Binary Choice With Endogenous Or Mismeasured Regressors.
> Cam
> ----------------------------------------
>> From:
>> Date: Sun, 13 Nov 2011 15:41:46 +0100
>> Subject: st: Endogeneity and Panel Data : treatreg, ivregress or .. ? Any suggestion would be really appreciated !
>> To:
>> Dear Stata List,
>> Dear Mark Schaffer (I guess ;-) )
>> I have an econometric question related to endogenous variables and
>> panel data, and I believe that it can be interesting for anyone who
>> uses longitudinal data.
>> Here's the context :
>> I have a panel dataset of individuals who, at any time t, can
>> endogenously choose the value of a variable E (for endogenous). E is
>> not ordered and can take a few values (in my case, 6 possible
>> choices).
>> I am particularly interested in the effect of one of these choices on
>> a fully continuous outcome variable Y.
>> That is, at any time and for any individual I would like to estimate
>> Y_it = a + b*X_it + c*Z_it + e_it
>> where, for example, Z is a binary variable that equals 1 if
>> individual i chooses E="the value of interest" at time t, and zero
>> otherwise. Variables in X are assumed to be exogenous.
>> I believe I have a good instrument for Z, along with other demographic
>> control variables, and therefore I guess I have basically two
>> choices in order to take into account the panel nature of my dataset:
>> 1) using ivreg2 with the option cluster(id) and correcting for the
>> endogenous part with (Z= instrument + age + location of birth).
>> However Z is a dummy variable... I know this should not be a problem
>> but...
>> 2) using treatreg with the option vce(bootstrap, cluster(id)
>> reps(400)) and modeling the choice of E=2 (that is Z=1) with treat(Z=
>> instrument + age + location of birth)
>> 3) I tried to use xtivreg2 with fixed effects, but location of birth
>> is time invariant (and I believe very important in order to understand
>> Z) so it cannot be estimated.
>> Is my approach correct? Do you perhaps have other ways to tackle
>> this endogenous multiple-choice problem?
>> Moreover, in the context of panel data, do I always need to use
>> clustering on id in order to have correct standard errors?
>> My dataset is large, but I have much more time variation than
>> clusters. About 200 000 individuals and 10 million observations for
>> the whole dataset.
>> The period where the instrument is available reduces the dataset
>> considerably : 1 million observations and about 20 000 individuals.
>> An important remark: the panel is NOT balanced, so individuals can
>> come in and out of the dataset during the 10-year period covered by my
>> dataset. Some thus have very few observations, and some have hundreds
>> of rows.
>> Many thanks in advance for your suggestions,
>> Best,