Bookmark and Share

Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: Which inverter? (was: RE: RE: ivreg2 weak-id statistic and quadratic terms)


From   Richard Gates <rgates@stata.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Which inverter? (was: RE: RE: ivreg2 weak-id statistic and quadratic terms)
Date   Tue, 21 Feb 2012 14:08:27 -0600

Mark Schaffer recently inquired about the best matrix inverter to use
in Mata.  He stated: 

> Hi all.  I have traced the problem to the choice of inverter.  At least,
> it's definitely the problem in the auto dataset example below, and I'll
> bet it's the source of Miroslav's problem as well.
> 
> In most places, -ivreg2- and -ranktest- use Mata's general-purpose
> invsym() inverter.  In a few places they use Mata's QR decomposition
> qrsolve().  It is where the latter is used that things are going wrong.
> In the simple example I constructed with the auto data, qrsolve() has
> problems but invsym() does not.
> 
> This is an interesting question.  What's the best choice of inverter
> when faced with scaling problems?

He provides Mata code to demonstrate the problem he is having with svsolve()
and qrsolve(), both detecting a nonfull rank-situation due to scaling.  He uses
the moment matrices to compute the least squares solution X*b = y.  That is, 

> beta_sv=svsolve(XX,Xy)
> 
> beta_qr=qrsolve(XX,Xy)

where

> XX=quadcross(X,X)
> Xy=quadcross(X,y)

Computing the cross product of a matrix increases the chances of numerical
problems when inverting it.  In both cases the rank returned by svsolve() and
qrsolve() is 3, although there are 4 regressors in X.

Setting the optional tolerance to tol = 1e-5, yields a full-rank matrix.  This
tolerance is a scale for the default tolerance.  See -help mf_qrsolve- for
details.

However, both svsolve() and qrsolve() are designed to work with X and y
directly, thus avoiding the instabilities inherent in forming cross-product
matrices.  Both syminv() and lusolve(), designed to handle square matrices, use
extended precision numerics to get around this problem.

Using

beta_svd = svsolve(X,y,rank)
beta_qr = qrsolve(X,y,rank)

the solution is full rank, without using quad precision.  Incidentally,
it is a good idea to use the optional argument rank handle nonfull-rank
situations appropriately.

Below demonstrates:


. do qr1 tol(1e-5)

. cscript 
-------------------------------------------------------------------------BEGIN 

. 
. local 0, `0'

. syntax, [ tol(real 1) ]

. 
. cap log close

. log using qr.log, replace
-------------------------------------------------------------------------------
      name:  <unnamed>
       log:  /home/rbg/tech/qr/qr.log
  log type:  text
 opened on:  21 Feb 2012, 11:17:17

. 
. sysuse auto, clear
(1978 Automobile Data)

. gen double weightsq=weight^2

. 
. * Rescaled to reduce scaling problems
. gen double weight1=weight/1000

. gen double weight1sq=(weight/1000)^2

. 
. * -regress- benchmark
. reg mpg turn weight weightsq

      Source |       SS       df       MS              Number of obs =      74
-------------+------------------------------           F(  3,    70) =   48.59
       Model |  1650.78591     3  550.261969           Prob > F      =  0.0000
    Residual |  792.673553    70  11.3239079           R-squared     =  0.6756
-------------+------------------------------           Adj R-squared =  0.6617
       Total |  2443.45946    73  33.4720474           Root MSE      =  3.3651

------------------------------------------------------------------------------
         mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        turn |   -.148733   .1741053    -0.85   0.396    -.4959751    .1985091
      weight |   -.013562    .003953    -3.43   0.001     -.021446    -.005678
    weightsq |   1.34e-06   6.27e-07     2.14   0.036     9.35e-08    2.60e-06
       _cons |   55.08176    7.36366     7.48   0.000     40.39541    69.76812
------------------------------------------------------------------------------

. mat list e(b)

e(b)[1,4]
          turn      weight    weightsq       _cons
y1    -.148733  -.01356202   1.345e-06   55.081765

. qui reg mpg turn weight1 weight1sq

. mat list e(b)

e(b)[1,4]
          turn     weight1   weight1sq       _cons
y1    -.148733  -13.562021   1.3448538   55.081765

. 
. putmata y=mpg X=(turn weight weightsq 1) X1=(turn weight1 weight1sq 1),
(1 vector, 2 matrices posted)

. 
. mata:
------------------------------------------------- mata (type end to exit) -----
: 
: Xy = cross(X,y)

: Xy1 = cross(X1,y)

: XX = cross(X,X)

: XX1 = cross(X1,X1)

: 
: beta_inv=invsym(XX)*Xy

: beta_inv1=invsym(XX1)*Xy1

: beta_inv, beta_inv1
                  1              2
    +-------------------------------+
  1 |  -.1487330027   -.1487330027  |
  2 |  -.0135620214    -13.5620214  |
  3 |   1.34485e-06    1.344853823  |
  4 |   55.08176484    55.08176484  |
    +-------------------------------+

: 
: beta_lu=lusolve(XX,Xy)

: beta_lu1=lusolve(XX1,Xy1)

: beta_lu, beta_lu1
                  1              2
    +-------------------------------+
  1 |  -.1487330027   -.1487330027  |
  2 |  -.0135620214    -13.5620214  |
  3 |   1.34485e-06    1.344853823  |
  4 |   55.08176484    55.08176484  |
    +-------------------------------+

: 
: 
: beta_sv=svsolve(XX,Xy,rank=.,`tol')

: beta_sv1=svsolve(XX1,Xy1)

: rank
  4

: beta_sv, beta_sv1
                  1              2
    +-------------------------------+
  1 |  -.1487330122   -.1487330027  |
  2 |  -.0135620213    -13.5620214  |
  3 |   1.34485e-06    1.344853823  |
  4 |   55.08176508    55.08176484  |
    +-------------------------------+

: 
: mreldif(beta_sv,beta_inv)
  8.29376e-09

: 
: beta_qr=qrsolve(XX,Xy,rank=.,`tol')

: beta_qr1=qrsolve(XX1,Xy1)

: rank
  4

: beta_qr, beta_qr1
                  1              2
    +-------------------------------+
  1 |  -.1487329972   -.1487330027  |
  2 |  -.0135620214    -13.5620214  |
  3 |   1.34485e-06    1.344853823  |
  4 |   55.08176469    55.08176484  |
    +-------------------------------+

: 
: mreldif(beta_qr,beta_inv)
  4.80526e-09

: 
: beta_qr = qrsolve(X,y,rank)

: assert(rank==4)

: beta_sv = svsolve(X,y,rank)

: assert(rank==4)

: (beta_qr,beta_sv)
                  1              2
    +-------------------------------+
  1 |  -.1487330027   -.1487330027  |
  2 |  -.0135620214   -.0135620214  |
  3 |   1.34485e-06    1.34485e-06  |
  4 |   55.08176484    55.08176484  |
    +-------------------------------+

: 
: mreldif(beta_qr,beta_inv)
  2.54662e-14

: mreldif(beta_sv,beta_inv)
  8.38611e-13

: 
: end
-------------------------------------------------------------------------------

. cap log close

. exit

end of do-file


I hope this helps.

-Rich
rgates@stata.com
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   Site index