st: RE: ivreg2 with weight: results slightly different to those 3 months ago

From	Nick Cox <[email protected]>
To	"'[email protected]'" <[email protected]>
Subject	st: RE: ivreg2 with weight: results slightly different to those 3 months ago
Date	Mon, 18 Oct 2010 18:12:13 +0100

Look inside the code for a log (copied at the end of this post). This probably deserves some kind of prize for documentation of changes.
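
A minimal sketch of how the log can be read from within Stata, assuming ivreg2 is installed somewhere on the ado-path. It uses only the official commands -which-, -viewsource-, -findfile- and -type-:

```stata
* Show where ivreg2.ado lives and print its "*!" version line
which ivreg2

* Open the source in the Viewer; the VERSION COMMENTS block is in this file
viewsource ivreg2.ado

* Alternative that also lands in a log file: print the source to the Results window
findfile ivreg2.ado
type "`r(fn)'"
```

-findfile- leaves the full path in r(fn), so the last line works wherever the file was installed.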

Nick
[email protected]

Steve

I have been re-running a do file that I last ran in July and I have been
getting slightly different results for most parts of the regression output
when using -ivreg2- with weights (-ivreg2- without weights generates
identical results). For example, the Kleibergen-Paap rk LM statistic and
Kleibergen-Paap Wald rk F statistic are the same as they were in July but
most of the output changes slightly (including the regression coefficients).

I am now using:

. which ivreg2
c:\ado\plus\i\ivreg2.ado
*! ivreg2 3.0.05 17June2010

I guess that I must have updated the ado file sometime between July and now
because today's log file includes Angrist-Pischke (AP) statistics but the
July log file does not (it first reports Shea's Partial R2). I do not know
which version of ivreg2 I was using previously. Sorry!
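
One way to keep track of the version in future runs is to write it into the log at the top of the do-file. A minimal sketch: the log file name is hypothetical, and the -ivreg2, version- call assumes that the version option listed under 2.1.09 in the change log is invoked this way.

```stata
* Hypothetical log name; adjust to the project
log using ivreg2_run, text replace

* Record the installed versions of ivreg2 and of ranktest, which ivreg2 requires
which ivreg2
which ranktest

* Assumption: the version option reports the ivreg2 version number
ivreg2, version

* ... estimation commands go here ...

log close
```

With these lines in place, a July log and an October log can be compared version against version.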

In both cases I am estimating the same model:

ivreg2 lallcancersdsmr lneed89HPandG (lg2_89expphHPandG=llonepenhx ///
    lpoppucarx) [w=rawpop89], gmm2s robust ffirst endog(lg2_89expphHPandG)

Have changes been made to -ivreg2- (perhaps fairly recently) which would
account for this?

********************************** VERSION COMMENTS **********************************
* Initial version cloned from official ivreg version 5.0.9 19Dec2001
* 1.0.2: add logic for reg3. Sargan test
* 1.0.3: add prunelist to ensure that count of excluded exogenous is correct
* 1.0.4: revise option to exog(), allow included exog to be specified as well
* 1.0.5: switch from reg3 to regress, many options and output changes
* 1.0.6: fixed treatment of nocons in Sargan and C-stat, and corrected problems
* relating to use of nocons combined with a constant as an IV
* 1.0.7: first option reports F-test of excluded exogenous; prunelist bug fix
* 1.0.8: dropped prunelist and switched to housekeeping of variable lists
* 1.0.9: added collinearity checks; C-stat calculated with recursive call;
* added ffirst option to report only F-test of excluded exogenous
* from 1st stage regressions
* 1.0.10: 1st stage regressions also report partial R2 of excluded exogenous
* 1.0.11: complete rewrite of collinearity approach - no longer uses calls to
* _rmcoll, does not track specific variables dropped; prunelist removed
* 1.0.12: reorganised display code and saved results to enable -replay()-
* 1.0.13: -robust- and -cluster- now imply -small-
* 1.0.14: fixed hascons bug; removed ivreg predict fn (it didn't work); allowed
* robust and cluster with z stats and correct dofs
* 1.0.15: implemented robust Sargan stat; changed to only F-stat, removed chi-sq;
* removed exog option (only orthog works)
* 1.0.16: added clusterised Sargan stat; robust Sargan handles collinearities;
* predict now works with standard SE options plus resids; fixed orthog()
* so it accepts time series operators etc.
* 1.0.17: fixed handling of weights. fw, aw, pw & iw all accepted.
* 1.0.18: fixed bug in robust Sargan code relating to time series variables.
* 1.0.19: fixed bugs in reporting ranks of X'X and Z'Z
* fixed bug in reporting presence of constant
* 1.0.20: added GMM option and replaced robust Sargan with (equivalent) J;
* added saved statistics of 1st stage regressions
* 1.0.21: added Cragg HOLS estimator, including allowing empty endog list;
* -regress- syntax now not allowed; revised code searching for "_cons"
* 1.0.22: modified cluster output message; fixed bug in replay for Sargan/Hansen stat;
* exactly identified Sargan/Hansen now exactly zero and p-value not saved as e();
* cluster multiplier changed to 1 (from buggy multiplier), in keeping with
* eg Wooldridge 2002 p. 193.
* 1.0.23: fixed orthog option to prevent abort when restricted equation is underid.
* 1.0.24: fixed bug if 1st stage regressions yielded missing values for saving in e().
* 1.0.25: Added Shea version of partial R2
* 1.0.26: Replaced Shea algorithm with Godfrey algorithm
* 1.0.27: Main call to regress is OLS form if OLS or HOLS is specified; error variance
* in Sargan and C statistics use small-sample adjustment if -small- option is
* specified; dfn of S matrix now correctly divided by sample size
* 1.0.28: HAC covariance estimation implemented
* Symmetrize all matrices before calling syminv
* Added hack to catch F stats that ought to be missing but actually have a
* huge-but-not-missing value
* Fixed dof of F-stat - was using rank of ZZ, should have used rank of XX (couldn't use df_r
* because it isn't always saved. This is because saving df_r triggers small stats
* (t and F) even when -post- is called without dof() option, hence df_r saved only
* with -small- option and hence a separate saved macro Fdf2 is needed.)
* Added rankS to saved macros
* Fixed trap for "no regressors specified"
* Added trap to catch gmm option with no excluded instruments
* Allow OLS syntax (no endog or excluded IVs specified)
* Fixed error messages and traps for rank-deficient robust cov matrix; includes
* singleton dummy possibility
* Capture error if posting estimated VCV that isn't pos def and report slightly
* more informative error message
* Checks 3 variable lists (endo, inexog, exexog) separately for collinearities
* Added AC (autocorrelation-consistent but conditionally-homoskedastic) option
* Sargan no longer has small-sample correction if -small- option
* robust, cluster, AC, HAC all passed on to first-stage F-stat
* bw must be < T
* 1.0.29 -orthog- also displays Hansen-Sargan of unrestricted equation
* Fixed collinearity check to include nocons as well as hascons
* Fixed small bug in Godfrey-Shea code - macros were global rather than local
* Fixed larger bug in Godfrey-Shea code - was using mixture of sigma-squares from IV and OLS
* with and without small-sample corrections
* Added liml and kclass
* 1.0.30 Changed order of insts macro to match saved matrices S and W
* 2.0.00 Collinearities no longer -qui-
* List of instruments tested in -orthog- option prettified
* 2.0.01 Fixed handling of nocons with no included exogenous, including LIML code
* 2.0.02 Allow C-test if unrestricted equation is just-identified. Implemented by
* saving Hansen-Sargan dof as = 0 in e() if just-identified.
* 2.0.03 Added score() option per latest revision to official ivreg
* 2.0.04 Changed score() option to pscore() per new official ivreg
* 2.0.05 Fixed est hold bug in first-stage regressions
* Fixed F-stat finite sample adjustment with cluster option to match official Stata
* Fixed F-stat so that it works with hascons (collinearity with constant is removed)
* Fixed bug in F-stat code - wasn't handling failed posting of vcv
* No longer allows/ignores nonsense options
* 2.0.06 Modified lsStop to sync with official ivreg 5.1.3
* 2.0.07a Working version of CUE option
* Added sortpreserve, ivar and tvar options
* Fixed small bug in calculation of T for AC/HAC - wasn't using the last ob
* in QS kernel, and didn't take account of possible dropped observations
* 2.0.07b Fixed macro bug that truncated long varlists
* 2.0.07c Added dof option.
* Changed display of RMSE so that more digits are displayed (was %8.1g)
* Fixed small bug where cstat was local macro and should have been scalar
* Fixed bug where C stat failed with cluster. NB: wmatrix option and cluster are not compatible!
* 2.0.7d Fixed bug in dof option
* 2.1.0 Added first-stage identification, weak instruments, and redundancy stats
* 2.1.01 Tidying up cue option checks, reporting of cue in output header, etc.
* 2.1.02 Used Poskitt-Skeels (2002) result that C-D eval = cceval / (1-cceval)
* 2.1.03 Added saved lists of separate included and excluded exogenous IVs
* 2.1.04 Added Anderson-Rubin test of signif of endog regressors
* 2.1.05 Fix minor bugs relating to cluster and new first-stage stats
* 2.1.06 Fix bug in cue: capture estimates hold without corresponding capture on estimates unhold
* 2.1.07 Minor fix to ereturn local wexp, promote to version 8.2
* 2.1.08 Added dofminus option, removed dof option. Added A-R test p-values to e().
* Minor bug fix to A-R chi2 test - was N chi2, should have been N-L chi2.
* Changed output to remove potentially misleading refs to N-L etc.
* Bug fix to rhs count - sometimes regressors could have exact zero coeffs
* Bug fix related to cluster - if user omitted -robust-, orthog would use Sargan and not J
* Changed output of Shea R2 to make clearer that F and p-values do not refer to it
* Improved handling of collinearities to check across inexog, exexog and endo lists
* Total weight statement moved to follow summ command
* Added traps to catch errors if no room to save temporary estimations with _est hold
* Added -savefirst- option. Removed -hascons-, now synonymous with -nocons-.
* 2.1.09 Fixes to dof option with cluster so it no longer mimics incorrect areg behavior
* Local ivreg2_cmd to allow testing under name ivreg2
* If wmatrix supplied, used (previously not used if non-robust sargan stat generated)
* Allowed OLS using (=) syntax (empty endo and exexog lists)
* Clarified error message when S matrix is not of full rank
* cdchi2p, ardf, ardf_r added to saved macros
* first and ffirst replay() options; DispFirst and DispFFirst separately codes 1st stage output
* Added savefprefix, macro with saved first-stage equation names.
* Added version option.
* Added check for duplicate variables to collinearity checks
* Rewrote/simplified Godfrey-Shea partial r2 code
* 2.1.10 Added NOOUTput option
* Fixed rf bug so that first does not trigger unnecessary saved rf
* Fixed cue bug - was not starting with robust 2-step gmm if robust/cluster
* 2.1.11 Dropped incorrect/misleading dofminus adjustments in first-stage output summary
* 2.1.12 Collinearity check now checks across inexog/exexog/endog simultaneously
* 2.1.13 Added check to catch failed first-stage regressions
* Fixed misleading failed C-stat message
* 2.1.14 Fixed mishandling of missing values in AC (non-robust) block
* 2.1.15 Fixed bug in RF - was ignoring weights
* Added -endog- option
* Save W matrix for all cases; ensured copy is posted with wmatrix option so original isn't zapped
* Fixed cue bug - with robust, was entering IV block and overwriting correct VCV
* 2.1.16 Added -fwl- option
* Saved S is now robust cov matrix of orthog conditions if robust, whereas W is possibly non-robust
* weighting matrix used by estimator. inv(S)=W if estimator is efficient GMM.
* Removed pscore option (dropped by official ivreg).
* Fixed bug where -post- would fail because of missing values in vcv
* Remove hascons as synonym for nocons
* OLS now outputs 2nd footer with variable lists
* 2.1.17 Reorganization of code
* Added ll() macro
* Fixed N bug where weights meant a non-integer ob count that was rounded down
* Fixed -fwl- option so it correctly handles weights (must include when partialling-out)
* smatrix option takes over from wmatrix option. Consistent treatment of both.
* Saved smatrix and wmatrix now differ in case of inefficient GMM.
* Added title() and subtitle() options.
* b0 option returns a value for the Sargan/J stat even if exactly id'd.
* (Useful for S-stat = value of GMM objective function.)
* HAC and AC now allowed with LIML and k-class.
* Collinearity improvements: bug fixed because collinearity was mistakenly checked across
* inexog/exexog/endog simultaneously; endog predicted exactly by IVs => reclassified as inexog;
* _rmcollright enforces inexog>endo>exexog priority for collinearities, if Stata 9.2 or later.
* K-class, LIML now report Sargan and J. C-stat based on Sargan/J. LIML reports AR if homosked.
* nb: can always easily get a C-stat for LIML based on diff of two AR stats.
* Always save Sargan-Hansen as e(j); also save as e(sargan) if homoskedastic.
* Added Stock-Watson robust SEs options sw()
* 2.1.18 Added Cragg-Donald-Stock-Yogo weak ID statistic critical values to main output
* Save exexog_ct, inexog_ct and endog_ct as macros
* Stock-Watson robust SEs now assume ivar is group variable
* Option -sw- is standard SW. Option -swpsd- is PSD version a la page 6 point 10.
* Added -noid- option. Suppresses all first-stage and identification statistics.
* Internal calls to ivreg2 use noid option.
* Added hyperlinks to ivreg2.hlp and helpfile argument to display routines to enable this.
* 2.1.19 Added matrix rearrangement and checks for smatrix and wmatrix options
* Recursive calls to cstat simplified - no matrix rearrangement or separate robust/nonrobust needed
* Reintroduced weak ID stats to ffirst output
* Added robust ID stats to ffirst output for case of single endogenous regressor
* Fixed obscure bug in reporting 1st stage partial r2 - would report zero if no included exogenous vars
* Removed "HOLS" in main output (misleading if, e.g., estimation is AC but not HAC)
* Removed "ML" in main output if no endogenous regressors - now all ML is labelled LIML
* model=gmm is now model=gmm2s; wmatrix estimation is model=gmm
* wmatrix relates to gmm estimator; smatrix relates to gmm var-cov matrix; b0 behavior equiv to wmatrix
* b0 option implies nooutput and noid options
* Added nocollin option to skip collinearity checks
* Fixed minor display bug in ffirst output for endog vars with varnames > 12 characters
* Fixed bug in saved rf and first-stage results for vars with long varnames; uses permname
* Fixed bug in model df - had counted RHS, now calculates rank(V) since latter may be rank-deficient
* Rank of V now saved as macro rankV
* fwl() now allows partialling-out of just constant with _cons
* Added Stock-Wright S statistic (but adds overhead - calls preserve)
* Properties now include svyj.
* Noted only: fwl bug doesn't allow time-series operators.
* 2.1.20 Fixed Stock-Wright S stat bug - didn't allow time-series operators
* 2.1.21 Fixed Stock-Wright S stat to allow for no exog regressors cases
* 2.2.00 CUE partials out exog regressors, estimates endog coeffs, then exog regressors separately - faster
* gmm2s becomes standard option, gmm supported as legacy option
* 2.2.01 Added explanatory messages if gmm2s used.
* States if estimates efficient for/stats consistent for het, AC, etc.
* Fixed small bug that prevented "{help `helpfile'##fwl:fwl}" from displaying when -capture-d.
* Error message in footer about insuff rank of S changed to warning message with more informative message.
* Fixed bug in CUE with weights.
* 2.2.02 Removed CUE partialling-out; still available with fwl
* smatrix and wmatrix become documented options. e(model)="gmmw" means GMM with arbitrary W
* 2.2.03 Fixed bug in AC with aweights; was weighting zi'zi but not ei'ei.
* 2.2.04 Added abw code for bw(), removed properties(svyj)
* 2.2.05 Fixed bug in AC; need to clear variable vt1 at start of loop
* If iweights, N (#obs with precision) rounded to nearest integer to mimic official Stata treatment
* and therefore don't need N scalar at all - will be same as N
* Saves fwl_ct as macro.
* -ffirst- output, weak id stat, etc. now adjust for number of partialled-out variables.
* Related changes: df_m, df_r include adjustments for partialled-out variables.
* Option nofwlsmall introduced - suppresses above adjustments. Undocumented in ivreg2.hlp.
* Replaced ID tests based on canon corr with Kleibergen-Paap rk-based stats if not homoskedastic
* Replaced LR ID test stats with LM test stats.
* Checks that -ranktest- is installed.
* 2.2.06 Fixed bug with missing F df when cue called; updated required version of ranktest
* 2.2.07 Modified redundancy test statistic to match standard regression-based LM tests
* Change name of -fwl- option to -partial-.
* Use of b0 means e(model)=CUE. Added informative b0 option titles. b0 generates output but noid.
* Removed check for integer bandwidth if auto option used.
* 2.2.08 Add -nocollin- to internal calls and to -ivreg2_cue- to speed performance.
* 2.2.09 Per msg from Brian Poi, Alastair Hall verifies that Newey-West cited constant of 1.1447
* is correct. Corrected mata abw() function. Require -ranktest- 1.1.03.
* 2.2.10 Added Angrist-Pischke multivariate f stats. Rewrite of first and ffirst output.
* Added Cragg-Donald to weak ID output even when non-iid.
* Fixed small bug in non-robust HAC code whereby extra obs could be used even if dep var missing.
* (required addition of L`tau'.(`s1resid') in creation of second touse variable)
* Fixed bugs that zapped varnames with "_cons" in them
* Changed tvar and ivar setup so that data must be tsset or xtset.
* Fixed bug in redundancy test stat when called by xtivreg2+cluster - no dofminus adj needed in this case
* Changed reporting so that gaps between panels are not reported as such.
* Added check that weight variable is not transformed by partialling out.
* Changed Stock-Wright S statistic so that it uses straight partialling-out of exog regressors
* (had been, in effect, doing 2SGMM partialling-out)
* Fixed bug where dropped collinear endogenous didn't get a warning or listing
* Removed N*CDEV Wald chi-sq statistic from ffirst output (LM stat enough)
* 3.0.00 Fully rewritten and Mata-ized code. Require min Stata 10.1 and ranktest 1.2.00.
* Mata support for Stock-Watson SEs for fixed effects estimator; doesn't support fweights.
* Changed handling of iweights yielding non-integer N so that (unlike official -regress-) all calcs
* for RMSE etc. use non-integer N and N is rounded down only at the end.
* Added support for Thompson/Cameron-Gelbach-Miller 2-level cluster-robust vcvs.
* 3.0.01 Now exits more gracefully if no regressors survive after collinearity checks
* 3.0.02 -capture- instead of -qui- before reduced form to suppress not-full-rank error warning
* Modified Stock-Wright code to partial out all incl Xs first, to reduce possibility of not-full-rank
* omega and missing sstat. Added check within Stock-Wright code to catch not-full-rank omega.
* Fixed bug where detailed first-stage stats with cluster were disrupted if data had been tsset
* using a different variable.
* Fixed bug that didn't allow regression on just a constant.
* Added trap for no observations.
* Added trap for auto bw with panel data - not allowed.
* 3.0.03 Fixed bug in m_omega that always used Stock-Watson spectral decomp to create invertible shat
* instead of only when (undocumented) spsd option is called.
* Fixed bug where, if matsize too small, exited with wrong error (mistakenly detected as collinearities)
* Removed inefficient call to -ranktest- that unnecessarily requested stats for all ranks, not just full.
* 3.0.04 Fixed coding error in m_omega for cluster+kernel. Was *vcvo.e[tmatrix[.,1]], should have been (*vcvo.e)[tmatrix[.,1]].
* Fixed bug whereby clusters defined by strings were not handled correctly.
* Updated ranktest version check
* 3.0.05 Added check to catch unwanted transformations of time or panel variables by partial option.
