Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: "sdm1" <sdm1@york.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: RE: ivreg2 with weight: results slightly different to those 3 months ago
Date: Mon, 18 Oct 2010 18:20:52 +0100

I see what you mean! Thanks.

Steve

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: 18 October 2010 18:12
To: 'statalist@hsphsun2.harvard.edu'
Subject: st: RE: ivreg2 with weight: results slightly different to those 3 months ago

Look inside the code for a log (copied at the end of this post). This probably deserves some kind of prize for documentation of changes.

Nick
n.j.cox@durham.ac.uk

Steve

I have been re-running a do file that I last ran in July and I have been getting slightly different results for most parts of the regression output when using -ivreg2- with weights (-ivreg2- without weights generates identical results). For example, the Kleibergen-Paap rk LM statistic and Kleibergen-Paap Wald rk F statistic are the same as they were in July but most of the output changes slightly (including the regression coefficients).

I am now using:

. which ivreg2
c:\ado\plus\i\ivreg2.ado
*! ivreg2 3.0.05 17June2010

I guess that I must have updated the ado file sometime between July and now because today's log file includes Angrist-Pischke (AP) statistics but the July log file does not (it first reports Shea's Partial R2). I do not know which version of ivreg2 I was using previously. Sorry!

In both cases I am estimating the same model:

ivreg2 lallcancersdsmr lneed89HPandG (lg2_89expphHPandG=llonepenhx lpoppucarx) [w=rawpop89], gmm2s robust ffirst endog(lg2_89expphHPandG)

Have changes been made to -ivreg2- (perhaps fairly recently) which would account for this?

********************************** VERSION COMMENTS **********************************
* Initial version cloned from official ivreg version 5.0.9 19Dec2001
* 1.0.2: add logic for reg3. Sargan test
* 1.0.3: add prunelist to ensure that count of excluded exogeneous is correct
* 1.0.4: revise option to exog(), allow included exog to be specified as well
* 1.0.5: switch from reg3 to regress, many options and output changes
* 1.0.6: fixed treatment of nocons in Sargan and C-stat, and corrected problems
*         relating to use of nocons combined with a constant as an IV
* 1.0.7: first option reports F-test of excluded exogenous; prunelist bug fix
* 1.0.8: dropped prunelist and switched to housekeeping of variable lists
* 1.0.9: added collinearity checks; C-stat calculated with recursive call;
*         added ffirst option to report only F-test of excluded exogenous
*         from 1st stage regressions
* 1.0.10: 1st stage regressions also report partial R2 of excluded exogenous
* 1.0.11: complete rewrite of collinearity approach - no longer uses calls to
*         _rmcoll, does not track specific variables dropped; prunelist removed
* 1.0.12: reorganised display code and saved results to enable -replay()-
* 1.0.13: -robust- and -cluster- now imply -small-
* 1.0.14: fixed hascons bug; removed ivreg predict fn (it didn't work); allowed
*         robust and cluster with z stats and correct dofs
* 1.0.15: implemented robust Sargan stat; changed to only F-stat, removed chi-sq;
*         removed exog option (only orthog works)
* 1.0.16: added clusterised Sargan stat; robust Sargan handles collinearities;
*         predict now works with standard SE options plus resids; fixed orthog()
*         so it accepts time series operators etc.
* 1.0.17: fixed handling of weights. fw, aw, pw & iw all accepted.
* 1.0.18: fixed bug in robust Sargan code relating to time series variables.
* 1.0.19: fixed bugs in reporting ranks of X'X and Z'Z
*         fixed bug in reporting presence of constant
* 1.0.20: added GMM option and replaced robust Sargan with (equivalent) J;
*         added saved statistics of 1st stage regressions
* 1.0.21: added Cragg HOLS estimator, including allowing empty endog list;
*         -regress- syntax now not allowed; revised code searching for "_cons"
* 1.0.22: modified cluster output message; fixed bug in replay for Sargan/Hansen stat;
*         exactly identified Sargan/Hansen now exactly zero and p-value not saved as e();
*         cluster multiplier changed to 1 (from buggy multiplier), in keeping with
*         eg Wooldridge 2002 p. 193.
* 1.0.23: fixed orthog option to prevent abort when restricted equation is underid.
* 1.0.24: fixed bug if 1st stage regressions yielded missing values for saving in e().
* 1.0.25: Added Shea version of partial R2
* 1.0.26: Replaced Shea algorithm with Godfrey algorithm
* 1.0.27: Main call to regress is OLS form if OLS or HOLS is specified; error variance
*         in Sargan and C statistics use small-sample adjustment if -small- option is
*         specified; dfn of S matrix now correctly divided by sample size
* 1.0.28: HAC covariance estimation implemented
*         Symmetrize all matrices before calling syminv
*         Added hack to catch F stats that ought to be missing but actually have a
*         huge-but-not-missing value
*         Fixed dof of F-stat - was using rank of ZZ, should have used rank of XX (couldn't use df_r
*         because it isn't always saved. This is because saving df_r triggers small stats
*         (t and F) even when -post- is called without dof() option, hence df_r saved only
*         with -small- option and hence a separate saved macro Fdf2 is needed.
*         Added rankS to saved macros
*         Fixed trap for "no regressors specified"
*         Added trap to catch gmm option with no excluded instruments
*         Allow OLS syntax (no endog or excluded IVs specified)
*         Fixed error messages and traps for rank-deficient robust cov matrix; includes
*         singleton dummy possibility
*         Capture error if posting estimated VCV that isn't pos def and report slightly
*         more informative error message
*         Checks 3 variable lists (endo, inexog, exexog) separately for collinearities
*         Added AC (autocorrelation-consistent but conditionally-homoskedastic) option
*         Sargan no longer has small-sample correction if -small- option
*         robust, cluster, AC, HAC all passed on to first-stage F-stat
*         bw must be < T
* 1.0.29 -orthog- also displays Hansen-Sargan of unrestricted equation
*         Fixed collinearity check to include nocons as well as hascons
*         Fixed small bug in Godfrey-Shea code - macros were global rather than local
*         Fixed larger bug in Godfrey-Shea code - was using mixture of sigma-squares from IV and OLS
*         with and without small-sample corrections
*         Added liml and kclass
* 1.0.30 Changed order of insts macro to match saved matrices S and W
* 2.0.00 Collinearities no longer -qui-
*         List of instruments tested in -orthog- option prettified
* 2.0.01 Fixed handling of nocons with no included exogenous, including LIML code
* 2.0.02 Allow C-test if unrestricted equation is just-identified. Implemented by
*         saving Hansen-Sargan dof as = 0 in e() if just-identified.
* 2.0.03 Added score() option per latest revision to official ivreg
* 2.0.04 Changed score() option to pscore() per new official ivreg
* 2.0.05 Fixed est hold bug in first-stage regressions
*         Fixed F-stat finite sample adjustment with cluster option to match official Stata
*         Fixed F-stat so that it works with hascons (collinearity with constant is removed)
*         Fixed bug in F-stat code - wasn't handling failed posting of vcv
*         No longer allows/ignores nonsense options
* 2.0.06 Modified lsStop to sync with official ivreg 5.1.3
* 2.0.07a Working version of CUE option
*         Added sortpreserve, ivar and tvar options
*         Fixed smalls bug in calculation of T for AC/HAC - wasn't using the last ob
*         in QS kernel, and didn't take account of possible dropped observations
* 2.0.07b Fixed macro bug that truncated long varlists
* 2.0.07c Added dof option.
*         Changed display of RMSE so that more digits are displayed (was %8.1g)
*         Fixed small bug where cstat was local macro and should have been scalar
*         Fixed bug where C stat failed with cluster. NB: wmatrix option and cluster are not compatible!
* 2.0.7d Fixed bug in dof option
* 2.1.0 Added first-stage identification, weak instruments, and redundancy stats
* 2.1.01 Tidying up cue option checks, reporting of cue in output header, etc.
* 2.1.02 Used Poskitt-Skeels (2002) result that C-D eval = cceval / (1-cceval)
* 2.1.03 Added saved lists of separate included and excluded exogenous IVs
* 2.1.04 Added Anderson-Rubin test of signif of endog regressors
* 2.1.05 Fix minor bugs relating to cluster and new first-stage stats
* 2.1.06 Fix bug in cue: capture estimates hold without corresponding capture on estimates unhold
* 2.1.07 Minor fix to ereturn local wexp, promote to version 8.2
* 2.1.08 Added dofminus option, removed dof option. Added A-R test p-values to e().
*         Minor bug fix to A-R chi2 test - was N chi2, should have been N-L chi2.
*         Changed output to remove potentially misleading refs to N-L etc.
*         Bug fix to rhs count - sometimes regressors could have exact zero coeffs
*         Bug fix related to cluster - if user omitted -robust-, orthog would use Sargan and not J
*         Changed output of Shea R2 to make clearer that F and p-values do not refer to it
*         Improved handling of collinearites to check across inexog, exexog and endo lists
*         Total weight statement moved to follow summ command
*         Added traps to catch errors if no room to save temporary estimations with _est hold
*         Added -savefirst- option. Removed -hascons-, now synonymous with -nocons-.
* 2.1.09 Fixes to dof option with cluster so it no longer mimics incorrect areg behavior
*         Local ivreg2_cmd to allow testing under name ivreg2
*         If wmatrix supplied, used (previously not used if non-robust sargan stat generated)
*         Allowed OLS using (=) syntax (empty endo and exexog lists)
*         Clarified error message when S matrix is not of full rank
*         cdchi2p, ardf, ardf_r added to saved macros
*         first and ffirst replay() options; DispFirst and DispFFirst separately codes 1st stage output
*         Added savefprefix, macro with saved first-stage equation names.
*         Added version option.
*         Added check for duplicate variables to collinearity checks
*         Rewrote/simplified Godfrey-Shea partial r2 code
* 2.1.10 Added NOOUTput option
*         Fixed rf bug so that first does not trigger unnecessary saved rf
*         Fixed cue bug - was not starting with robust 2-step gmm if robust/cluster
* 2.1.11 Dropped incorrect/misleading dofminus adjustments in first-stage output summary
* 2.1.12 Collinearity check now checks across inexog/exexog/endog simultaneously
* 2.1.13 Added check to catch failed first-stage regressions
*         Fixed misleading failed C-stat message
* 2.1.14 Fixed mishandling of missing values in AC (non-robust) block
* 2.1.15 Fixed bug in RF - was ignoring weights
*         Added -endog- option
*         Save W matrix for all cases; ensured copy is posted with wmatrix option so original isn't zapped
*         Fixed cue bug - with robust, was entering IV block and overwriting correct VCV
* 2.1.16 Added -fwl- option
*         Saved S is now robust cov matrix of orthog conditions if robust, whereas W is possibly non-robust
*         weighting matrix used by estmator. inv(S)=W if estimator is efficient GMM.
*         Removed pscore option (dropped by official ivreg).
*         Fixed bug where -post- would fail because of missing values in vcv
*         Remove hascons as synonym for nocons
*         OLS now outputs 2nd footer with variable lists
* 2.1.17 Reorganization of code
*         Added ll() macro
*         Fixed N bug where weights meant a non-integer ob count that was rounded down
*         Fixed -fwl- option so it correctly handles weights (must include when partialling-out)
*         smatrix option takes over from wmatrix option. Consistent treatment of both.
*         Saved smatrix and wmatrix now differ in case of inefficient GMM.
*         Added title() and subtitle() options.
*         b0 option returns a value for the Sargan/J stat even if exactly id'd.
*         (Useful for S-stat = value of GMM objective function.)
*         HAC and AC now allowed with LIML and k-class.
*         Collinearity improvements: bug fixed because collinearity was mistakenly checked across
*         inexog/exexog/endog simultaneously; endog predicted exactly by IVs => reclassified as inexog;
*         _rmcollright enforces inexog>endo>exexog priority for collinearities, if Stata 9.2 or later.
*         K-class, LIML now report Sargan and J. C-stat based on Sargan/J. LIML reports AR if homosked.
*         nb: can always easily get a C-stat for LIML based on diff of two AR stats.
*         Always save Sargan-Hansen as e(j); also save as e(sargan) if homoskedastic.
*         Added Stock-Watson robust SEs options sw()
* 2.1.18 Added Cragg-Donald-Stock-Yogo weak ID statistic critical values to main output
*         Save exexog_ct, inexog_ct and endog_ct as macros
*         Stock-Watson robust SEs now assume ivar is group variable
*         Option -sw- is standard SW. Option -swpsd- is PSD version a la page 6 point 10.
*         Added -noid- option. Suppresses all first-stage and identification statistics.
*         Internal calls to ivreg2 use noid option.
*         Added hyperlinks to ivreg2.hlp and helpfile argument to display routines to enable this.
* 2.1.19 Added matrix rearrangement and checks for smatrix and wmatrix options
*         Recursive calls to cstat simplified - no matrix rearrangement or separate robust/nonrobust needed
*         Reintroduced weak ID stats to ffirst output
*         Added robust ID stats to ffirst output for case of single endogenous regressor
*         Fixed obscure bug in reporting 1st stage partial r2 - would report zero if no included exogenous vars
*         Removed "HOLS" in main output (misleading if, e.g., estimation is AC but not HAC)
*         Removed "ML" in main output if no endogenous regressors - now all ML is labelled LIML
*         model=gmm is now model=gmm2s; wmatrix estimation is model=gmm
*         wmatrix relates to gmm estimator; smatrix relates to gmm var-cov matrix; b0 behavior equiv to wmatrix
*         b0 option implies nooutput and noid options
*         Added nocollin option to skip collinearity checks
*         Fixed minor display bug in ffirst output for endog vars with varnames > 12 characters
*         Fixed bug in saved rf and first-stage results for vars with long varnames; uses permname
*         Fixed bug in model df - had counted RHS, now calculates rank(V) since latter may be rank-deficient
*         Rank of V now saved as macro rankV
*         fwl() now allows partialling-out of just constant with _cons
*         Added Stock-Wright S statistic (but adds overhead - calls preserve)
*         Properties now include svyj.
*         Noted only: fwl bug doesn't allow time-series operators.
* 2.1.20 Fixed Stock-Wright S stat bug - didn't allow time-series operators
* 2.1.21 Fixed Stock-Wright S stat to allow for no exog regressors cases
* 2.2.00 CUE partials out exog regressors, estimates endog coeffs, then exog regressors separately - faster
*         gmm2s becomes standard option, gmm supported as legacy option
* 2.2.01 Added explanatory messages if gmm2s used.
*         States if estimates efficient for/stats consistent for het, AC, etc.
*         Fixed small bug that prevented "{help `helpfile'##fwl:fwl}" from displaying when -capture-d.
*         Error message in footer about insuff rank of S changed to warning message with more informative message.
*         Fixed bug in CUE with weights.
* 2.2.02 Removed CUE partialling-out; still available with fwl
*         smatrix and wmatrix become documented options. e(model)="gmmw" means GMM with arbitrary W
* 2.2.03 Fixed bug in AC with aweights; was weighting zi'zi but not ei'ei.
* 2.2.04 Added abw code for bw(), removed properties(svyj)
* 2.2.05 Fixed bug in AC; need to clear variable vt1 at start of loop
*         If iweights, N (#obs with precision) rounded to nearest integer to mimic official Stata treatment
*         and therefore don't need N scalar at all - will be same as N
*         Saves fwl_ct as macro.
*         -ffirst- output, weak id stat, etc. now adjust for number of partialled-out variables.
*         Related changes: df_m, df_r include adjustments for partialled-out variables.
*         Option nofwlsmall introduced - suppresses above adjustments. Undocumented in ivreg2.hlp.
*         Replaced ID tests based on canon corr with Kleibergen-Paap rk-based stats if not homoskedastic
*         Replaced LR ID test stats with LM test stats.
*         Checks that -ranktest- is installed.
* 2.2.06 Fixed bug with missing F df when cue called; updated required version of ranktest
* 2.2.07 Modified redundancy test statistic to match standard regression-based LM tests
*         Change name of -fwl- option to -partial-.
*         Use of b0 means e(model)=CUE. Added informative b0 option titles. b0 generates output but noid.
*         Removed check for integer bandwidth if auto option used.
* 2.2.08 Add -nocollin- to internal calls and to -ivreg2_cue- to speed performance.
* 2.2.09 Per msg from Brian Poi, Alastair Hall verifies that Newey-West cited constant of 1.1447
*         is correct. Corrected mata abw() function. Require -ranktest- 1.1.03.
* 2.2.10 Added Angrist-Pischke multivariate f stats. Rewrite of first and ffirst output.
*         Added Cragg-Donald to weak ID output even when non-iid.
*         Fixed small bug in non-robust HAC code whereby extra obs could be used even if dep var missing.
*         (required addition of L`tau'.(`s1resid') in creation of second touse variable)
*         Fixed bugs that zapped varnames with "_cons" in them
*         Changed tvar and ivar setup so that data must be tsset or xtset.
*         Fixed bug in redundancy test stat when called by xtivreg2+cluster - no dofminus adj needed in this case
*         Changed reporting so that gaps between panels are not reported as such.
*         Added check that weight variable is not transformed by partialling out.
*         Changed Stock-Wright S statistic so that it uses straight partialling-out of exog regressors
*         (had been, in effect, doing 2SGMM partialling-out)
*         Fixed bug where dropped collinear endogenous didn't get a warning or listing
*         Removed N*CDEV Wald chi-sq statistic from ffirst output (LM stat enough)
* 3.0.00 Fully rewritten and Mata-ized code. Require min Stata 10.1 and ranktest 1.2.00.
*         Mata support for Stock-Watson SEs for fixed effects estimator; doesn't support fweights.
*         Changed handling of iweights yielding non-integer N so that (unlike official -regress-) all calcs
*         for RMSE etc. use non-integer N and N is rounded down only at the end.
*         Added support for Thompson/Cameron-Gelbach-Miller 2-level cluster-robust vcvs.
* 3.0.01 Now exits more gracefully if no regressors survive after collinearity checks
* 3.0.02 -capture- instead of -qui- before reduced form to suppress not-full-rank error warning
*         Modified Stock-Wright code to partial out all incl Xs first, to reduce possibility of not-full-rank
*         omega and missing sstat. Added check within Stock-Wright code to catch not-full-rank omega.
*         Fixed bug where detailed first-stage stats with cluster were disrupted if data had been tsset
*         using a different variables.
*         Fixed bug that didn't allow regression on just a constant.
*         Added trap for no observations.
*         Added trap for auto bw with panel data - not allowed.
* 3.0.03 Fixed bug in m_omega that always used Stock-Watson spectral decomp to create invertible shat
*         instead of only when (undocumented) spsd option is called.
*         Fixed bug where, if matsize too small, exited with wrong error (mistakenly detected as collinearities)
*         Removed inefficient call to -ranktest- that unnecessarily requested stats for all ranks, not just full.
* 3.0.04 Fixed coding error in m_omega for cluster+kernel. Was *vcvo.e[tmatrix[.,1]], should have been (*vcvo.e)[tmatrix[.,1]].
*         Fixed bug whereby clusters defined by strings were not handled correctly.
*         Updated ranktest version check
* 3.0.05 Added check to catch unwanted transformations of time or panel variables by partial option.

Nick
n.j.cox@durham.ac.uk

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/