
From: Misha Spisok <misha.spisok@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: st: Instrumental Variables - Consistency with Nonlinear Endogenous Variable
Date: Tue, 29 Dec 2009 00:26:04 -0800

Hello, Statalist!

In Microeconometrics Using Stata, by Cameron and Trivedi, Exercise 11 of Chapter 6 (page 204) is a simulation exercise, which I give below, followed by my "solution" and then my question. Apologies in advance for the length of this message.

The basic question concerns the consistency of the IV estimator when the endogenous variable enters the structural equation nonlinearly.

"When an endogenous variable enters the regression nonlinearly, the obvious IV estimator is inconsistent and a modification is needed. Specifically, suppose y1 = b*y2^2 + u, and the first-stage equation for y2 is y2 = p*z + v, where the zero-mean errors u and v are correlated. Here the endogenous regressor appears in the structural equation as y2^2 rather than y2. The IV estimator is

    b_hat_IV = (sum z_i * y2_i^2)^(-1) * (sum z_i * y1_i).

This can be implemented by a regular IV regression of y on y2^2 with the instrument z: regress y2^2 on z and then regress y1 on the first-stage prediction y2^2_hat. If instead we regress y2 on z at the first stage, giving y2_hat, and then regress y1 on (y2_hat)^2, an inconsistent estimate is obtained. Generate a simulation sample to demonstrate these points. Consider whether this example can be generalized to other nonlinear models where the nonlinearity is in regressors only, so that y1 = g(y2)'beta + u, where g(y2) is a nonlinear function of y2 [y2 being a vector of variables]."
(Microeconometrics Using Stata, Cameron and Trivedi)

Here is my approach:

    clear
    set seed 10101
    quietly set obs 10000
    generate double z = 5*rnormal(0)    /* instrument */
    generate double x = 5*rnormal(0)
    matrix C = (1, -0.5 \ -0.5, 1)      /* correlation structure */
    corr2data u v, corr(C)              /* correlated errors */
    generate double y2 = 3*z + v        /* endogenous variable due to correlation with error, u */
    generate y2sq = y2^2
    generate double y1 = 5 + 2*y2sq + x + u

    reg y2sq z x
    predict y2sq_hat, xb
    reg y1 y2sq_hat x, robust

    reg y2 z x
    predict y2_hat, xb
    generate y2_hat_sq = y2_hat^2
    reg y1 y2_hat_sq x, robust

The coefficient estimates on both y2sq_hat and y2_hat_sq are near the actual coefficient of 2. However, the standard error for the estimate deemed inconsistent is remarkably small, yielding a t-value of 452, while the standard error for the estimate deemed consistent is slightly larger than the parameter estimate, yielding a t-value of less than one.

My generated data do not make it clear that one estimate is inconsistent while the other is consistent. Moreover, the one deemed inconsistent is not only close to the actual coefficient, but its standard error is also quite small.

What am I doing wrong?

Thank you for your attention and help.

Misha
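
The exercise notes that the consistent estimator "can be implemented by a regular IV regression". As a cross-check on the manual two-step version in the post, a minimal sketch using Stata's built-in -ivregress- is given below. It rebuilds the same simulated data so that it runs on its own. The choice of vce(robust) and the follow-up -estat firststage- are assumptions added here for illustration and are not part of the exercise. Unlike a manual second-stage -regress-, -ivregress- reports standard errors that account for the first stage, and -estat firststage- shows how strongly z predicts y2sq.

```stata
* Sketch only: rebuilds the simulated data from the post, then uses
* ivregress instead of two manual regress steps.
clear
set seed 10101
quietly set obs 10000
generate double z = 5*rnormal(0)        // instrument
generate double x = 5*rnormal(0)        // exogenous regressor
matrix C = (1, -0.5 \ -0.5, 1)          // correlation of (u, v)
corr2data u v, corr(C)                  // correlated errors
generate double y2 = 3*z + v            // endogenous through v
generate double y2sq = y2^2
generate double y1 = 5 + 2*y2sq + x + u

* IV regression of y1 on y2sq, instrumenting y2sq with z; x enters as exogenous
ivregress 2sls y1 x (y2sq = z), vce(robust)

* First-stage diagnostics for the instrument z
estat firststage
```

Comparing the -ivregress- output with the first manual two-step regression in the post should give the same point estimate on y2sq, while the reported standard errors may differ.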
