Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

st: RE: ivreg2 with weight: results slightly different to those 3 months ago

From   Nick Cox <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   st: RE: ivreg2 with weight: results slightly different to those 3 months ago
Date   Mon, 18 Oct 2010 18:12:13 +0100

Look inside the code for a log (copied at the end of this post). This probably deserves some kind of prize for documentation of changes.

[email protected]


I have been re-running a do file that I last ran in July and I have been
getting slightly different results for most parts of the regression output
when using -ivreg2- with weights (-ivreg2- without weights generates
identical results).  For example, the Kleibergen-Paap rk LM statistic and
Kleibergen-Paap Wald rk F statistic are the same as they were in July but
most of the output changes slightly (including the regression coefficients).

I am now using:

. which ivreg2
*! ivreg2 3.0.05  17June2010

I guess that I must have updated the ado file sometime between July and now
because today's log file includes Angrist-Pischke (AP) statistics but the
July log file does not (it first reports Shea's Partial R2).  I do not know
which version of ivreg2 I was using previously.  Sorry!

In both cases I am estimating the same model:

ivreg2 lallcancersdsmr lneed89HPandG (lg2_89expphHPandG=llonepenhx
lpoppucarx) [w=rawpop89], gmm2s robust ffirst endog(lg2_89expphHPandG)

Have changes been made to -ivreg2- (perhaps fairly recently) which would
account for this?

********************************** VERSION COMMENTS **********************************
*  Initial version cloned from official ivreg version 5.0.9  19Dec2001
*  1.0.2:  add logic for reg3. Sargan test
*  1.0.3:  add prunelist to ensure that count of excluded exogeneous is correct
*  1.0.4:  revise option to exog(), allow included exog to be specified as well
*  1.0.5:  switch from reg3 to regress, many options and output changes
*  1.0.6:  fixed treatment of nocons in Sargan and C-stat, and corrected problems
*          relating to use of nocons combined with a constant as an IV
*  1.0.7:  first option reports F-test of excluded exogenous; prunelist bug fix
*  1.0.8:  dropped prunelist and switched to housekeeping of variable lists
*  1.0.9:  added collinearity checks; C-stat calculated with recursive call;
*          added ffirst option to report only F-test of excluded exogenous
*          from 1st stage regressions
*  1.0.10: 1st stage regressions also report partial R2 of excluded exogenous
*  1.0.11: complete rewrite of collinearity approach - no longer uses calls to
*          _rmcoll, does not track specific variables dropped; prunelist removed
*  1.0.12: reorganised display code and saved results to enable -replay()-
*  1.0.13: -robust- and -cluster- now imply -small-
*  1.0.14: fixed hascons bug; removed ivreg predict fn (it didn't work); allowed
*          robust and cluster with z stats and correct dofs
*  1.0.15: implemented robust Sargan stat; changed to only F-stat, removed chi-sq;
*          removed exog option (only orthog works)
*  1.0.16: added clusterised Sargan stat; robust Sargan handles collinearities;
*          predict now works with standard SE options plus resids; fixed orthog()
*          so it accepts time series operators etc.
*  1.0.17: fixed handling of weights.  fw, aw, pw & iw all accepted.
*  1.0.18: fixed bug in robust Sargan code relating to time series variables.
*  1.0.19: fixed bugs in reporting ranks of X'X and Z'Z
*          fixed bug in reporting presence of constant
*  1.0.20: added GMM option and replaced robust Sargan with (equivalent) J;
*          added saved statistics of 1st stage regressions
*  1.0.21: added Cragg HOLS estimator, including allowing empty endog list;
*          -regress- syntax now not allowed; revised code searching for "_cons"
*  1.0.22: modified cluster output message; fixed bug in replay for Sargan/Hansen stat;
*          exactly identified Sargan/Hansen now exactly zero and p-value not saved as e();
*          cluster multiplier changed to 1 (from buggy multiplier), in keeping with
*          eg Wooldridge 2002 p. 193.
*  1.0.23: fixed orthog option to prevent abort when restricted equation is underid.
*  1.0.24: fixed bug if 1st stage regressions yielded missing values for saving in e().
*  1.0.25: Added Shea version of partial R2
*  1.0.26: Replaced Shea algorithm with Godfrey algorithm
*  1.0.27: Main call to regress is OLS form if OLS or HOLS is specified; error variance
*          in Sargan and C statistics use small-sample adjustment if -small- option is
*          specified; dfn of S matrix now correctly divided by sample size
*  1.0.28: HAC covariance estimation implemented
*          Symmetrize all matrices before calling syminv
*          Added hack to catch F stats that ought to be missing but actually have a
*          huge-but-not-missing value
*          Fixed dof of F-stat - was using rank of ZZ, should have used rank of XX (couldn't use df_r
*          because it isn't always saved.  This is because saving df_r triggers small stats
*          (t and F) even when -post- is called without dof() option, hence df_r saved only
*          with -small- option and hence a separate saved macro Fdf2 is needed.
*          Added rankS to saved macros
*          Fixed trap for "no regressors specified"
*          Added trap to catch gmm option with no excluded instruments
*          Allow OLS syntax (no endog or excluded IVs specified)
*          Fixed error messages and traps for rank-deficient robust cov matrix; includes
*          singleton dummy possibility
*          Capture error if posting estimated VCV that isn't pos def and report slightly
*          more informative error message
*          Checks 3 variable lists (endo, inexog, exexog) separately for collinearities
*          Added AC (autocorrelation-consistent but conditionally-homoskedastic) option
*          Sargan no longer has small-sample correction if -small- option
*          robust, cluster, AC, HAC all passed on to first-stage F-stat
*          bw must be < T
*  1.0.29  -orthog- also displays Hansen-Sargan of unrestricted equation
*          Fixed collinearity check to include nocons as well as hascons
*          Fixed small bug in Godfrey-Shea code - macros were global rather than local
*          Fixed larger bug in Godfrey-Shea code - was using mixture of sigma-squares from IV and OLS
*            with and without small-sample corrections
*          Added liml and kclass
*  1.0.30  Changed order of insts macro to match saved matrices S and W
*  2.0.00  Collinearities no longer -qui-
*          List of instruments tested in -orthog- option prettified
*  2.0.01  Fixed handling of nocons with no included exogenous, including LIML code
*  2.0.02  Allow C-test if unrestricted equation is just-identified.  Implemented by
*          saving Hansen-Sargan dof as = 0 in e() if just-identified.
*  2.0.03  Added score() option per latest revision to official ivreg
*  2.0.04  Changed score() option to pscore() per new official ivreg
*  2.0.05  Fixed est hold bug in first-stage regressions
*          Fixed F-stat finite sample adjustment with cluster option to match official Stata
*          Fixed F-stat so that it works with hascons (collinearity with constant is removed)
*          Fixed bug in F-stat code - wasn't handling failed posting of vcv
*          No longer allows/ignores nonsense options
*  2.0.06  Modified lsStop to sync with official ivreg 5.1.3
*  2.0.07a Working version of CUE option
*          Added sortpreserve, ivar and tvar options
*          Fixed smalls bug in calculation of T for AC/HAC - wasn't using the last ob
*          in QS kernel, and didn't take account of possible dropped observations
*  2.0.07b Fixed macro bug that truncated long varlists
*  2.0.07c Added dof option.
*          Changed display of RMSE so that more digits are displayed (was %8.1g)
*          Fixed small bug where cstat was local macro and should have been scalar
*          Fixed bug where C stat failed with cluster.  NB: wmatrix option and cluster are not compatible!
*  2.0.7d  Fixed bug in dof option
*  2.1.0   Added first-stage identification, weak instruments, and redundancy stats
*  2.1.01  Tidying up cue option checks, reporting of cue in output header, etc.
*  2.1.02  Used Poskitt-Skeels (2002) result that C-D eval = cceval / (1-cceval)
*  2.1.03  Added saved lists of separate included and excluded exogenous IVs
*  2.1.04  Added Anderson-Rubin test of signif of endog regressors
*  2.1.05  Fix minor bugs relating to cluster and new first-stage stats
*  2.1.06  Fix bug in cue: capture estimates hold without corresponding capture on estimates unhold
*  2.1.07  Minor fix to ereturn local wexp, promote to version 8.2
*  2.1.08  Added dofminus option, removed dof option.  Added A-R test p-values to e().
*          Minor bug fix to A-R chi2 test - was N chi2, should have been N-L chi2.
*          Changed output to remove potentially misleading refs to N-L etc.
*          Bug fix to rhs count - sometimes regressors could have exact zero coeffs
*          Bug fix related to cluster - if user omitted -robust-, orthog would use Sargan and not J
*          Changed output of Shea R2 to make clearer that F and p-values do not refer to it
*          Improved handling of collinearites to check across inexog, exexog and endo lists
*          Total weight statement moved to follow summ command
*          Added traps to catch errors if no room to save temporary estimations with _est hold
*          Added -savefirst- option. Removed -hascons-, now synonymous with -nocons-.
*  2.1.09  Fixes to dof option with cluster so it no longer mimics incorrect areg behavior
*          Local ivreg2_cmd to allow testing under name ivreg2
*          If wmatrix supplied, used (previously not used if non-robust sargan stat generated)
*          Allowed OLS using (=) syntax (empty endo and exexog lists)
*          Clarified error message when S matrix is not of full rank
*          cdchi2p, ardf, ardf_r added to saved macros
*          first and ffirst replay() options; DispFirst and DispFFirst separately codes 1st stage output
*          Added savefprefix, macro with saved first-stage equation names.
*          Added version option.
*          Added check for duplicate variables to collinearity checks
*          Rewrote/simplified Godfrey-Shea partial r2 code
* 2.1.10   Added NOOUTput option
*          Fixed rf bug so that first does not trigger unnecessary saved rf
*          Fixed cue bug - was not starting with robust 2-step gmm if robust/cluster
* 2.1.11   Dropped incorrect/misleading dofminus adjustments in first-stage output summary
* 2.1.12   Collinearity check now checks across inexog/exexog/endog simultaneously
* 2.1.13   Added check to catch failed first-stage regressions
*          Fixed misleading failed C-stat message
* 2.1.14   Fixed mishandling of missing values in AC (non-robust) block
* 2.1.15   Fixed bug in RF - was ignoring weights
*          Added -endog- option
*          Save W matrix for all cases; ensured copy is posted with wmatrix option so original isn't zapped
*          Fixed cue bug - with robust, was entering IV block and overwriting correct VCV
* 2.1.16   Added -fwl- option
*          Saved S is now robust cov matrix of orthog conditions if robust, whereas W is possibly non-robust
*          weighting matrix used by estmator.  inv(S)=W if estimator is efficient GMM.
*          Removed pscore option (dropped by official ivreg).
*          Fixed bug where -post- would fail because of missing values in vcv
*          Remove hascons as synonym for nocons
*          OLS now outputs 2nd footer with variable lists
* 2.1.17   Reorganization of code
*          Added ll() macro
*          Fixed N bug where weights meant a non-integer ob count that was rounded down
*          Fixed -fwl- option so it correctly handles weights (must include when partialling-out)
*          smatrix option takes over from wmatrix option.  Consistent treatment of both.
*          Saved smatrix and wmatrix now differ in case of inefficient GMM.
*          Added title() and subtitle() options.
*          b0 option returns a value for the Sargan/J stat even if exactly id'd.
*          (Useful for S-stat = value of GMM objective function.)
*          HAC and AC now allowed with LIML and k-class.
*          Collinearity improvements: bug fixed because collinearity was mistakenly checked across
*          inexog/exexog/endog simultaneously; endog predicted exactly by IVs => reclassified as inexog;
*          _rmcollright enforces inexog>endo>exexog priority for collinearities, if Stata 9.2 or later.
*          K-class, LIML now report Sargan and J.  C-stat based on Sargan/J.  LIML reports AR if homosked.
*          nb: can always easily get a C-stat for LIML based on diff of two AR stats.
*          Always save Sargan-Hansen as e(j); also save as e(sargan) if homoskedastic.
*          Added Stock-Watson robust SEs options sw()
* 2.1.18   Added Cragg-Donald-Stock-Yogo weak ID statistic critical values to main output
*          Save exexog_ct, inexog_ct and endog_ct as macros
*          Stock-Watson robust SEs now assume ivar is group variable
*          Option -sw- is standard SW.  Option -swpsd- is PSD version a la page 6 point 10.
*          Added -noid- option.  Suppresses all first-stage and identification statistics.
*          Internal calls to ivreg2 use noid option.
*          Added hyperlinks to ivreg2.hlp and helpfile argument to display routines to enable this.
* 2.1.19   Added matrix rearrangement and checks for smatrix and wmatrix options
*          Recursive calls to cstat simplified - no matrix rearrangement or separate robust/nonrobust needed
*          Reintroduced weak ID stats to ffirst output
*          Added robust ID stats to ffirst output for case of single endogenous regressor
*          Fixed obscure bug in reporting 1st stage partial r2 - would report zero if no included exogenous vars
*          Removed "HOLS" in main output (misleading if, e.g., estimation is AC but not HAC)
*          Removed "ML" in main output if no endogenous regressors - now all ML is labelled LIML
*          model=gmm is now model=gmm2s; wmatrix estimation is model=gmm
*          wmatrix relates to gmm estimator; smatrix relates to gmm var-cov matrix; b0 behavior equiv to wmatrix
*          b0 option implies nooutput and noid options
*          Added nocollin option to skip collinearity checks
*          Fixed minor display bug in ffirst output for endog vars with varnames > 12 characters
*          Fixed bug in saved rf and first-stage results for vars with long varnames; uses permname
*          Fixed bug in model df - had counted RHS, now calculates rank(V) since latter may be rank-deficient
*          Rank of V now saved as macro rankV
*          fwl() now allows partialling-out of just constant with _cons
*          Added Stock-Wright S statistic (but adds overhead - calls preserve)
*          Properties now include svyj.
*          Noted only: fwl bug doesn't allow time-series operators.
* 2.1.20   Fixed Stock-Wright S stat bug - didn't allow time-series operators
* 2.1.21   Fixed Stock-Wright S stat to allow for no exog regressors cases
* 2.2.00   CUE partials out exog regressors, estimates endog coeffs, then exog regressors separately - faster
*          gmm2s becomes standard option, gmm supported as legacy option
* 2.2.01   Added explanatory messages if gmm2s used.
*          States if estimates efficient for/stats consistent for het, AC, etc.
*          Fixed small bug that prevented "{help `helpfile'##fwl:fwl}" from displaying when -capture-d.
*          Error message in footer about insuff rank of S changed to warning message with more informative message.
*          Fixed bug in CUE with weights.
* 2.2.02   Removed CUE partialling-out; still available with fwl
*          smatrix and wmatrix become documented options. e(model)="gmmw" means GMM with arbitrary W
* 2.2.03   Fixed bug in AC with aweights; was weighting zi'zi but not ei'ei.
* 2.2.04   Added abw code for bw(), removed properties(svyj)
* 2.2.05   Fixed bug in AC; need to clear variable vt1 at start of loop
*          If iweights, N (#obs with precision) rounded to nearest integer to mimic official Stata treatment
*          and therefore don't need N scalar at all - will be same as N
*          Saves fwl_ct as macro.
*          -ffirst- output, weak id stat, etc. now adjust for number of partialled-out variables.
*          Related changes: df_m, df_r include adjustments for partialled-out variables.
*          Option nofwlsmall introduced - suppresses above adjustments.  Undocumented in ivreg2.hlp.
*          Replaced ID tests based on canon corr with Kleibergen-Paap rk-based stats if not homoskedastic
*          Replaced LR ID test stats with LM test stats.
*          Checks that -ranktest- is installed.
* 2.2.06   Fixed bug with missing F df when cue called; updated required version of ranktest
* 2.2.07   Modified redundancy test statistic to match standard regression-based LM tests
*          Change name of -fwl- option to -partial-.
*          Use of b0 means e(model)=CUE.  Added informative b0 option titles. b0 generates output but noid.
*          Removed check for integer bandwidth if auto option used.
* 2.2.08   Add -nocollin- to internal calls and to -ivreg2_cue- to speed performance.
* 2.2.09   Per msg from Brian Poi, Alastair Hall verifies that Newey-West cited constant of 1.1447
*          is correct. Corrected mata abw() function. Require -ranktest- 1.1.03.
* 2.2.10   Added Angrist-Pischke multivariate f stats.  Rewrite of first and ffirst output.
*          Added Cragg-Donald to weak ID output even when non-iid.
*          Fixed small bug in non-robust HAC code whereby extra obs could be used even if dep var missing.
*             (required addition of  L`tau'.(`s1resid') in creation of second touse variable)
*          Fixed bugs that zapped varnames with "_cons" in them
*          Changed tvar and ivar setup so that data must be tsset or xtset.
*          Fixed bug in redundancy test stat when called by xtivreg2+cluster - no dofminus adj needed in this case
*          Changed reporting so that gaps between panels are not reported as such.
*          Added check that weight variable is not transformed by partialling out.
*          Changed Stock-Wright S statistic so that it uses straight partialling-out of exog regressors
*            (had been, in effect, doing 2SGMM partialling-out)
*          Fixed bug where dropped collinear endogenous didn't get a warning or listing
*          Removed N*CDEV Wald chi-sq statistic from ffirst output (LM stat enough)
* 3.0.00   Fully rewritten and Mata-ized code.  Require min Stata 10.1 and ranktest 1.2.00.
*          Mata support for Stock-Watson SEs for fixed effects estimator; doesn't support fweights.
*          Changed handling of iweights yielding non-integer N so that (unlike official -regress-) all calcs
*          for RMSE etc. use non-integer N and N is rounded down only at the end.
*          Added support for Thompson/Cameron-Gelbach-Miller 2-level cluster-robust vcvs.
* 3.0.01   Now exits more gracefully if no regressors survive after collinearity checks
* 3.0.02   -capture- instead of -qui- before reduced form to suppress not-full-rank error warning
*          Modified Stock-Wright code to partial out all incl Xs first, to reduce possibility of not-full-rank
*          omega and missing sstat.  Added check within Stock-Wright code to catch not-full-rank omega.
*          Fixed bug where detailed first-stage stats with cluster were disrupted if data had been tsset
*          using a different variables.
*          Fixed bug that didn't allow regression on just a constant.
*          Added trap for no observations.
*          Added trap for auto bw with panel data - not allowed.
* 3.0.03   Fixed bug in m_omega that always used Stock-Watson spectral decomp to create invertible shat
*          instead of only when (undocumented) spsd option is called.
*          Fixed bug where, if matsize too small, exited with wrong error (mistakenly detected as collinearities)
*          Removed inefficient call to -ranktest- that unnecessarily requested stats for all ranks, not just full.
* 3.0.04   Fixed coding error in m_omega for cluster+kernel.  Was *vcvo.e[tmatrix[.,1]], should have been (*vcvo.e)[tmatrix[.,1]].
*          Fixed bug whereby clusters defined by strings were not handled correctly.
*          Updated ranktest version check
* 3.0.05   Added check to catch unwanted transformations of time or panel variables by partial option.

[email protected]

*   For searches and help try:

© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index