Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

From |
"Schaffer, Mark E" <M.E.Schaffer@hw.ac.uk> |

To |
<statalist@hsphsun2.harvard.edu> |

Subject |
st: Which inverter? (was: RE: RE: ivreg2 weak-id statistic and quadratic terms) |

Date |
Tue, 21 Feb 2012 00:11:09 -0000 |

Hi all. I have traced the problem to the choice of inverter. At least, it's definitely the problem in the auto dataset example below, and I'll bet it's the source of Miroslav's problem as well. In most places, -ivreg2- and -ranktest- use Mata's general-purpose invsym() inverter. In a few places they use Mata's QR decomposition qrsolve(). It is where the latter is used that things are going wrong. In the simple example I constructed with the auto data, qrsolve() has problems but invsym() does not. This is an interesting question. What's the best choice of inverter when faced with scaling problems? Below is some simple Mata code and output corresponding to a regression with the toy auto data set when the variables create scaling problems. Stata's built-in -regress- is the benchmark. In a nutshell: invsym() reproduces the -regress- results. lusolve() reproduces the -regress- results. svsolve() runs into problems - without rescaling it goes badly wrong. qrsolve() runs into problems - without rescaling it goes badly wrong. I've had a look at the discussions in the manual, and I didn't spot anything there that would explain this. Would someone who knows more about numerical computing than me care to comment? --Mark Stata code: ********************************************************************* sysuse auto, clear gen double weightsq=weight^2 * Rescaled to reduce scaling problems gen double weight1=weight/1000 gen double weight1sq=(weight/1000)^2 * -regress- benchmark qui reg mpg turn weight weightsq mat list e(b) qui reg mpg turn weight1 weight1sq mat list e(b) putmata y=mpg X=(turn weight weightsq 1) X1=(turn weight1 weight1sq 1), replace mata: XX=quadcross(X,X) Xy=quadcross(X,y) XX1=quadcross(X1,X1) Xy1=quadcross(X1,y) "Comparing beta hat for (1) unscaled and (2) scaled data" beta_inv=invsym(XX)*Xy beta_inv1=invsym(XX1)*Xy1 beta_inv, beta_inv1 beta_lu=lusolve(XX,Xy) beta_lu1=lusolve(XX1,Xy1) beta_lu, beta_lu1 beta_sv=svsolve(XX,Xy) beta_sv1=svsolve(XX1,Xy1) beta_sv, beta_sv1 beta_qr=qrsolve(XX,Xy) beta_qr1=qrsolve(XX1,Xy1) beta_qr, beta_qr1 end ********************************************************************* Output (using Stata 11.2) ********************************************************************* . sysuse auto, clear (1978 Automobile Data) . gen double weightsq=weight^2 . . * Rescaled to reduce scaling problems . gen double weight1=weight/1000 . gen double weight1sq=(weight/1000)^2 . . * -regress- benchmark . qui reg mpg turn weight weightsq . mat list e(b) e(b)[1,4] turn weight weightsq _cons y1 -.148733 -.01356202 1.345e-06 55.081765 . qui reg mpg turn weight1 weight1sq . mat list e(b) e(b)[1,4] turn weight1 weight1sq _cons y1 -.148733 -13.562021 1.3448538 55.081765 . . putmata y=mpg X=(turn weight weightsq 1) X1=(turn weight1 weight1sq 1), replace (1 vector, 2 matrices posted) . . mata: ------------------------------------------------- mata (type end to exit) ---------------------------------- : : XX=quadcross(X,X) : Xy=quadcross(X,y) : XX1=quadcross(X1,X1) : Xy1=quadcross(X1,y) : : "Comparing beta hat for (1) unscaled and (2) scaled data" Comparing beta hat for (1) unscaled and (2) scaled data : : beta_inv=invsym(XX)*Xy : beta_inv1=invsym(XX1)*Xy1 : beta_inv, beta_inv1 1 2 +-------------------------------+ 1 | -.1487330027 -.1487330027 | 2 | -.0135620214 -13.5620214 | 3 | 1.34485e-06 1.344853823 | 4 | 55.08176484 55.08176484 | +-------------------------------+ : : beta_lu=lusolve(XX,Xy) : beta_lu1=lusolve(XX1,Xy1) : beta_lu, beta_lu1 1 2 +-------------------------------+ 1 | -.1487330027 -.1487330027 | 2 | -.0135620214 -13.5620214 | 3 | 1.34485e-06 1.344853823 | 4 | 55.08176484 55.08176484 | +-------------------------------+ : : beta_sv=svsolve(XX,Xy) : beta_sv1=svsolve(XX1,Xy1) : beta_sv, beta_sv1 1 2 +-------------------------------+ 1 | .6585549729 -.1487330027 | 2 | .0057662425 -13.5620214 | 3 | -2.30511e-06 1.344853823 | 4 | .0096556129 55.08176484 | +-------------------------------+ : : beta_qr=qrsolve(XX,Xy) : beta_qr1=qrsolve(XX1,Xy1) : beta_qr, beta_qr1 1 2 +-------------------------------+ 1 | .6586963952 -.1487330027 | 2 | .0057696338 -13.5620214 | 3 | -2.30575e-06 1.344853823 | 4 | 0 55.08176484 | +-------------------------------+ : : end ********************************************************************* > -----Original Message----- > From: owner-statalist@hsphsun2.harvard.edu > [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of > Schaffer, Mark E > Sent: 20 February 2012 22:36 > To: statalist@hsphsun2.harvard.edu > Subject: st: RE: ivreg2 weak-id statistic and quadratic terms > > Hi Miroslav, hi all. > > I've checked this with the toy auto dataset. I can replicate > this behaviour. > > Miroslav - either before or after rescaling your covariates, > do the estimated coefficients vary hugely in scale? > > In my toy auto dataset example, I am pretty sure that it is > driven by scaling problems. For example, after > > sysuse auto, clear > gen double weight2=weight^2 > ivreg2 price (mpg=turn) weight weight2 > > gives a large weak ID stat of 11.5. But there are big > scaling problems in the first-stage and main estimations, > with coeffs that are something like a factor of 10^8 > different in magnitude. > > If I estimate and just partial out the constant, > > ivreg2 price (mpg=turn) weight weight2, partial(_cons) > > the ill-conditioning is less pronounced and I get a weak ID > stat of 0.73. > > If I partial out all the exogenous covariates, > > ivreg2 price (mpg=turn) weight weight2, partial(weight weight2) > > the ill-conditioning is gone and I again get a weak ID stat of 0.73. > > I will investigate further and will report back to the list > if I find anything more. It may be that -ivreg2- could > handle these cases more robustly. > > --Mark (ivreg2 coauthor) > > > -----Original Message----- > > From: owner-statalist@hsphsun2.harvard.edu > > [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Miros Lav > > Sent: 20 February 2012 21:25 > > To: statalist@hsphsun2.harvard.edu > > Subject: st: ivreg2 weak-id statistic and quadratic terms > > > > Dear all, > > > > I am using ivreg2 to estimate a model where a control > variable enters > > with a quadratic term. A simplified version of the command is as > > follows > > > > ivreg2 y (a=instrument) x x^2, r cluster(id). > > > > Estimating this model results in a very large > Kleinbergen-Paap weak-id > > F statistic. > > > > However, generating z=x/1000 and z^2=z*z and estimating the model > > > > ivreg2 y (a=instrument) z z^2, r cluster(id) > > > > results in a very low Kleinbergen-Paap weak-id F statistic. > > > > (The z-statistics and significance levels in the first and second > > stage regressions are the same as in the previous model.) > > > > Does anyone have an idea why these two equivalent models result in > > very different Kleinbergen-Paap weak-id F statistic? > > > > Thanks for your help! > > > > Miroslav > > * > > * For searches and help try: > > * http://www.stata.com/help.cgi?search > > * http://www.stata.com/support/statalist/faq > > * http://www.ats.ucla.edu/stat/stata/ > > > > > -- > Heriot-Watt University is a Scottish charity registered under > charity number SC000278. > > Heriot-Watt University is the Sunday Times Scottish > University of the Year 2011-2012 > > > > * > * For searches and help try: > * http://www.stata.com/help.cgi?search > * http://www.stata.com/support/statalist/faq > * http://www.ats.ucla.edu/stat/stata/ > -- Heriot-Watt University is a Scottish charity registered under charity number SC000278. Heriot-Watt University is the Sunday Times Scottish University of the Year 2011-2012 * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:**Re: st: Which inverter? (was: RE: RE: ivreg2 weak-id statistic and quadratic terms)***From:*Scott Merryman <scott.merryman@gmail.com>

**st: RE: Which inverter? (was: RE: RE: ivreg2 weak-id statistic and quadratic terms)***From:*"Schaffer, Mark E" <M.E.Schaffer@hw.ac.uk>

**References**:**st: ivreg2 weak-id statistic and quadratic terms***From:*Miros Lav <ranjit.rosof.b189@googlemail.com>

**st: RE: ivreg2 weak-id statistic and quadratic terms***From:*"Schaffer, Mark E" <M.E.Schaffer@hw.ac.uk>

- Prev by Date:
**Re: st: RE: calculating proportions** - Next by Date:
**Re: st: RE: statalist-digest V4 #4433** - Previous by thread:
**st: RE: ivreg2 weak-id statistic and quadratic terms** - Next by thread:
**st: RE: Which inverter? (was: RE: RE: ivreg2 weak-id statistic and quadratic terms)** - Index(es):