Hewan,

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Hewan Belay
> Sent: 09 July 2008 22:05
> To: statalist@hsphsun2.harvard.edu
> Subject: st: RE: xtoverid error: internal reestimation of eqn
> differs from original
>
> Dear Mark,
>
> Interesting points, thanks for your insights. It seems quite
> hard to get the -xtoverid, noisily- command to produce
> intelligible variable names. I followed your advice and
> created the lagged variables "by hand", and reran -xthtaylor-
> using these variables, and the temporary variables of
> -xtoverid, noi- still didn't resemble the vars in my model.
> Then, to narrow things down, I generated the two endogenous
> vars, which happen to be lagged variables, using very short
> names for them, and still no luck. Specifically, the two
> endog vars are L.rev_EXT and L.rev_IGF_3avg. I named them ex
> and ig respectively, and reran. The endogenous var that's
> being reclassified as exogenous still has a nondescript
> temporary name __000019, as opposed to something resembling
> "ex" or "ig". I'm suspecting the problem child is the
> endogenous variable L.rev_IGF_3avg (ie a LDV), and not
> L.rev_EXT. Because when I run the regression with only the
> latter being endog, xtoverid works
> fine. When I run it with only the former being endog,
> xtoverid gives the error message we're talking about.
>
> If indeed L.rev_IGF_3avg is perfectly collinear with some
> combination of the remaining variables, and if (as you're
> suspecting) xthtaylor somehow ignores that fact, at least for
> sure a standard OLS should drop one or more variables in the
> presence of perfect collinearity, right?

Here's a guess...

Internally, the Hausman-Taylor estimator works by creating new variables based on the existing ones.
Specifically, it will take a variable and (depending on how you specify it) create a new variable that is a group mean (constant within groups, different across groups) and/or another new variable that is a mean-deviation (different within groups but the group mean is zero). Another new variable that can be created would be a GLS transform combining the group means and demeans. Some or all of the temporary variables that -xtoverid- is reporting are these new variables. The -ec2sls- estimator for -xtivreg- does the same sort of thing.

Now Stata's -xthtaylor- and -xtivreg, ec2sls- do something a bit odd with these variables. Say you have a variable X that is a time-varying exogenous regressor. You transform X so that you have two new variables X_M and X_DM (group mean and demeaned, respectively). You then combine X and X_M to get the GLS transform, which would be X_GLS = (X - theta*X_M), where theta is a scalar if it's a balanced panel and a vector if it's unbalanced. In the transformed regression to get -xthtaylor- results, X_GLS is an exogenous regressor.

If the panel is balanced, then either X_M OR X_DM is available as a new excluded instrument, but NOT both. This is because, in a balanced panel, X_M and X_DM together are perfectly collinear with the regressor X_GLS; the theta in the previous paragraph is a scalar.

If the panel is unbalanced, X_M and X_DM are not perfectly collinear with X_GLS, because the theta is a vector, not a scalar. You could use both X_M and X_DM as excluded instruments and not have perfect collinearity. It might be almost collinear (if all the elements of the theta vector are very similar to each other), but not exactly collinear. But it wouldn't be a sensible thing to do, since the second instrument would be adding very little to the first. But which to use? X_M or X_DM?

Here is the odd bit: what Stata does internally is treat X_GLS as another *endogenous* regressor, and uses *both* X_M and X_DM as instruments.
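
(An aside for anyone who wants the collinearity argument in symbols. This is only a sketch: it assumes the usual random-effects form of the GLS transform, and the subscripts and the symbols T_i, sigma_e and sigma_u are notation added here for illustration, not taken from the -xthtaylor- code.)

```latex
\begin{align*}
X_{it} &= X^{M}_{i} + X^{DM}_{it},
  \qquad X^{M}_{i} = \frac{1}{T_i}\sum_{t=1}^{T_i} X_{it}, \\
X^{GLS}_{it} &= X_{it} - \theta_i X^{M}_{i}
  = (1-\theta_i)\,X^{M}_{i} + X^{DM}_{it}, \\
\theta_i &= 1 - \sqrt{\frac{\sigma_e^2}{\sigma_e^2 + T_i\,\sigma_u^2}} .
\end{align*}
```

With a balanced panel, T_i is the same for every group, so theta_i is a single number and X_GLS is an exact linear combination of X_M and X_DM: the three columns are linearly dependent. With an unbalanced panel the weight (1 - theta_i) changes from group to group, so no single pair of coefficients reproduces X_GLS, and the dependence is at most approximate. The same reasoning applies to any combination of X_M and X_DM whose weights depend on the data only through theta_i.
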
It looks odd, but it makes sense: by adding 1 endogenous regressor and 2 excluded instruments, the degree of overidentification is going up by 1, which is right.

My guess is that you have an X_GLS that is, for some reason, collinear with some of the excluded instruments. A way to test this is to limit your estimation sample to a balanced panel. -xtoverid- checks for this and knows that the X_GLS variables should be treated as exogenous regressors. If you don't get an error message, that's probably it.

NB: For anyone who read this far, I have a working version of xtivreg2 that generalizes to fixed effects, random effects, EC2SLS, Hausman-Taylor, G2SLS, etc. Here is the current syntax for replicating a Hausman-Taylor estimation:

* H-T estimation
xthtaylor ln_w age age2 tenure hours black birth_yr grade,     ///
    endog(tenure hours grade) constant(black birth_yr grade) i(idcode)

* xtivreg3 estimation, balanced panel
xtivreg3 ln_w age age2 black birth_yr (tenure hours grade=),   /// Endog vars in ()
    ivdm( (=age age2 tenure hours) )                            /// Excl IVs, demeaned
    gls i(idcode) small                                         //  Apply GLS transform to
                                                                //  dep var & regressors

* xtivreg3 estimation, unbalanced panel
xtivreg3 ln_w (age age2 tenure hours black birth_yr grade=),   /// Endog vars in ()
    ivm( (=age age2 black birth_yr) )                           /// Excl IVs, means
    ivdm( (=age age2 tenure hours) )                            /// Excl IVs, demeaned
    gls i(idcode) small                                         //  Apply GLS transform to
                                                                //  dep var & regressors

Note how, in the unbalanced case, exogenous regressors get added to the endog variable list as GLS transforms, and to the mean and demeaned instruments list.

Someday I will get around to finishing this and releasing it....

Cheers,
Mark

Prof.
Mark Schaffer
Director, CERT
Department of Economics
School of Management & Languages
Heriot-Watt University
Edinburgh EH14 4AS
tel +44-131-451-3494 / fax +44-131-451-3296
email: m.e.schaffer@hw.ac.uk
web: http://www.sml.hw.ac.uk/ecomes

> But when I do a
> simple -regress- on all the variables in this model, nothing
> drops out (I can send you the output on that). Does that not
> rule out perfect collinearity being the cause of the xtoverid
> problem? Is xtoverid sensitive to strong (but non-perfect)
> collinearity?
>
> Please see the output below for details on my above described
> effort with the two endog vars renamed "ex" and "ig". I would
> very much appreciate any further thoughts you may have on
> this mystery.
>
> Thanks,
> Hewan
>
> . g ex = L.rev_EXT
> (265 missing values generated)
>
> . g ig = L.rev_IGF_3avg
> (440 missing values generated)
>
> . xthtaylor rev_IGF_3avg popurb_share popdens pop p0 rain_av road_no literate rel_christ *akan *ewe ex ig L.(ex
> > p_pers_act exp_NPR exp_cap_act) dumreg1-dumreg7 dumreg9-dumreg10, endog(ex ig) varying(ex ig L.(exp_pers_act
> >  exp_NPR exp_cap_act))
>
> Hausman-Taylor estimation                  Number of obs      =       699
> Group variable: code                       Number of groups   =       106
>
>                                            Obs per group: min =         5
>                                                           avg =       6.6
>                                                           max =         7
>
> Random effects u_i ~ i.i.d.                Wald chi2(24)      =   5417.96
>                                            Prob > chi2        =    0.0000
>
> rev_IGF_3avg       Coef.  Std. Err.       z   P>|z|   [95% Conf. Interval]
>
> TVexogenous
> exp_pers_act
>          L1.    .0788914   .0169636    4.65   0.000    .0456433   .1121394
> exp_NPR
>          L1.    .1992826   .0252538    7.89   0.000    .1497861    .248779
> exp_cap_act
>          L1.   -.0038548   .0113339   -0.34   0.734   -.0260688   .0183592
> TVendogenous
> ex              .0068097   .0148884    0.46   0.647   -.0223711   .0359905
> ig              .6608472   .0284122   23.26   0.000    .6051602   .7165341
> TIexogenous
> popurb_share    .0011771    .000573    2.05   0.040    .0000541   .0023002
> popdens        -.0001731   .0000585   -2.96   0.003   -.0002877  -.0000584
> pop             .0001918   .0001302    1.47   0.141   -.0000633   .0004469
> p0             -.3915243   .1268809   -3.09   0.002   -.6402062  -.1428423
> rain_av         .0000586   .0000765    0.77   0.444   -.0000913   .0002086
> road_no        -.0120262   .0741182   -0.16   0.871   -.1572952   .1332429
> literate        .0035863   .0017501    2.05   0.040    .0001562   .0070164
> rel_christ     -.0016295   .0012886   -1.26   0.206   -.0041552   .0008962
> ethn_akan      -.0008522   .0006921   -1.23   0.218   -.0022086   .0005043
> ethn_ewe        .0004666   .0008781    0.53   0.595   -.0012544   .0021877
> dumreg1         .0913287   .0850729    1.07   0.283   -.0754111   .2580684
> dumreg2        -.0432531   .0780054   -0.55   0.579   -.1961408   .1096345
> dumreg3         .0388221   .0882678    0.44   0.660   -.1341796   .2118237
> dumreg4        -.0929884   .0735551   -1.26   0.206   -.2371536   .0511769
> dumreg5        -.0559623   .0740561   -0.76   0.450   -.2011097   .0891851
> dumreg6         .0299387   .0740921    0.40   0.686   -.1152792   .1751566
> dumreg7         -.044829   .0717926   -0.62   0.532   -.1855399   .0958819
> dumreg9         .1245367    .059701    2.09   0.037    .0075249   .2415485
> dumreg10        .1489092    .062665    2.38   0.017    .0260881   .2717303
>
> _cons           .5605179   .2319658    2.42   0.016    .1058733   1.015162
>
> sigma_u        .01675734
> sigma_e        .21148494
> rho            .00623926   (fraction of variance due to u_i)
>
> Note: TV refers to time varying; TI refers to time invariant.
>
> . xtoverid, noisily
> Warning - endogenous variable(s) collinear with instruments
> Vars now exogenous: __000019
>
> Unable to display summary of first-stage estimates; macro e(first) is missing
>
> IV (2SLS) estimation
>
> Estimates efficient for homoskedasticity only
> Statistics consistent for homoskedasticity only
>
>                                              Number of obs =      699
>                                              F( 25,   674) = 34571.80
>                                              Prob > F      =   0.0000
> Total (centered) SS   = 292.3090099          Centered R2   =   0.8935
> Total (uncentered) SS = 39947.58709          Uncentered R2 =   0.9992
> Residual SS           = 31.11655374          Root MSE      =    .2149
>
> __00000I           Coef.  Std. Err.       t   P>|t|   [95% Conf. Interval]
>
> __00000M        .0788883   .0169636    4.65   0.000    .0455804   .1121962
> __00000P        .1992816   .0252538    7.89   0.000     .149696   .2488672
> __00000S       -.0038579   .0113339   -0.34   0.734   -.0261119    .018396
> __00000V         .006814   .0148884    0.46   0.647   -.0224193   .0360473
> __00000Y         .660834   .0284121   23.26   0.000    .6050471   .7166208
> __00000Z        .0011771    .000573    2.05   0.040     .000052   .0023022
> __000010        -.000173   .0000585   -2.96   0.003   -.0002879  -.0000582
> __000011        .0001918   .0001302    1.47   0.141   -.0000638   .0004474
> __000012       -.3915953   .1268798   -3.09   0.002   -.6407224  -.1424682
> __000013        .0000586   .0000765    0.77   0.444   -.0000916   .0002088
> __000014       -.0120329   .0741184   -0.16   0.871   -.1575636   .1334978
> __000015        .0035862   .0017501    2.05   0.041    .0001499   .0070224
> __000016       -.0016296   .0012886   -1.26   0.206   -.0041598   .0009007
> __000017       -.0008521   .0006921   -1.23   0.219    -.002211   .0005068
> __000018        .0004667   .0008781    0.53   0.595   -.0012575   .0021908
> __00001A       -.0432515   .0780055   -0.55   0.579   -.1964146   .1099115
> __00001B          .03883   .0882679    0.44   0.660   -.1344831   .2121432
> __00001C       -.0929823   .0735552   -1.26   0.207   -.2374072   .0514425
> __00001D       -.0559592   .0740563   -0.76   0.450    -.201368   .0894496
> __00001E        .0299385   .0740923    0.40   0.686   -.1155409    .175418
> __00001F       -.0448234   .0717927   -0.62   0.533   -.1857877   .0961408
> __00001G         .124552    .059701    2.09   0.037    .0073298   .2417743
> __00001H        .1489231    .062665    2.38   0.018     .025881   .2719652
> __00000H        .5606982   .2319614    2.42   0.016    .1052443   1.016152
> __000019        .0913319    .085073    1.07   0.283   -.0757082   .2583719
>
> Sargan statistic (overidentification test of all instruments):   9.249
>                                              Chi-sq(4) P-val =   0.0552
>
> Instrumented:         __00000M __00000P __00000S __00000V __00000Y __00000Z
>                       __000010 __000011 __000012 __000013 __000014 __000015
>                       __000016 __000017 __000018 __000019 __00001A __00001B
>                       __00001C __00001D __00001E __00001F __00001G __00001H
> Included instruments: __00000H
> Excluded instruments: __00000L __00000O __00000R __00000U __00000X __00000K
>                       __00000N __00000Q popurb_share popdens pop p0 rain_av
>                       road_no literate rel_christ ethn_akan ethn_ewe dumreg1
>                       dumreg2 dumreg3 dumreg4 dumreg5 dumreg6 dumreg7 dumreg9
>                       dumreg10
> Reclassified as exog: __000019
>
> xtoverid error: internal reestimation of eqn differs from original
> r(198);
>
> end of do-file
>
> r(198);
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/

--
Heriot-Watt University is a Scottish charity registered under charity number SC000278.

