# st: Matrix Algebra

 From Alfonso Miranda To statalist@hsphsun2.harvard.edu Subject st: Matrix Algebra Date Mon, 7 Oct 2002 10:00:17 +0100 (BST)

```Note: Apologies if this mail comes out twice.

Dear Statalisters,

I am trying to estimate a count model with endogenous switching as
proposed by Terza (1998). It involves a two-step method of moments
estimator: the first stage is a probit and the second stage is non-linear
least squares. I have coded everything except the correction for the
covariance matrix. To calculate that correction I have to create an
intermediate matrix of the following general form:

Y = A'*W*A

A is an n x k matrix and W is an n x n diagonal matrix with the
regression's squared errors on the diagonal and zeros elsewhere. Let's say
that variables a1 a2 a3 form matrix A, and that the squared errors are saved
as variable res2. I have more than 20,000 observations in my dataset.

Since it is not possible to create a 20,000 x 20,000 matrix in Stata,
Michael Blasnik very kindly suggested using the weighting feature of
matrix accum to calculate matrix Y. Basically he suggested:

.mat accum H = a1 a2 a3 [aw=res2], noc

To check that this solution is correct, I estimated my model and the
residuals, then dropped all but 200 observations. Then, as Nick Cox also
kindly suggested, I calculated matrices A and W in the following way:

.mkmat a1 a2 a3, matrix(A)
.local n = _N
.matrix W = J(`n',`n',0)
.forval i=1/`n' {
     matrix W[`i',`i']=res2[`i']
 }
.matrix mymat = A'*W*A
.matrix list mymat

symmetric mymat[3,3]
           a1         a2         a3
a1  62054.362
a2  60504.697  60504.697
a3  1004.4707  1004.4707  1004.4707

This matrix is what I want, but with a large dataset it cannot be calculated
using Nick's suggestion. Now, using the weighting feature of matrix accum:

.matrix accum H = a1 a2 a3 [aw=res2], noc
(sum of wgt is   2.4974e+03)
(obs=200)

.matrix list H

symmetric H[3,3]
a1         a2          a3
a1  4969.5487
a2  4845.4456  4845.4456
a3  80.441821  80.441821  80.441821

Clearly, mymat and H are different. Jiang, Tao very kindly suggested an
alternative which would be:

.sca m=10000
.matrix accum Z1= a1 a2 a3 [fw=res2+m], noc
.matrix accum Z2= a1 a2 a3, noc
.matrix Z=Z1-m*Z2

However, since res2 is not an integer, frequency weights cannot be
used. I did what Jiang, Tao suggested using analytic weights instead:

sca m=10000
matrix accum Z1= a1 a2 a3 [aw=res2+m], noc
(sum of wgt is   2.0002e+07)
(obs=200)

matrix accum Z2= a1 a2 a3, noc
(obs=200)
matrix Z=Z1-m*Z2

matrix list Z

symmetric Z[3,3]
a1          a2           a3
a1   -4.670e+08
a2   -4.438e+08  -4.438e+08
a3   -4512856     -4512856     -4512856

This is also different from mymat. It seems, then, that none of the
suggested strategies yields the matrix that I need. Does anyone have
another idea?

Many thanks,

Alfonso Miranda
University of Warwick

```
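One possible reading of the mismatch, offered as a sketch rather than as an answer confirmed in this thread: Stata rescales analytic weights so that they sum to the number of observations. That would shrink H by obs / (sum of wgt) = 200 / 2497.4, roughly 0.0801, and 4969.5487 / 62054.362 is also roughly 0.0801. Importance weights are used as given, so `[iw=res2]` should accumulate the sum over observations of res2 times the outer product of (a1, a2, a3), which is A'*W*A without ever forming W. The example below runs on made-up data (the generated variables, the seed, and the matrix names G and H are assumptions for illustration, not the poster's data) and compares both routes against the explicit 200 x 200 calculation.

```stata
* Sketch on simulated data; variable names follow the post, values do not.
clear
set obs 200
set seed 12345
generate a1 = runiform()
generate a2 = rnormal()
generate a3 = 1
generate res2 = rnormal()^2

* Long way: explicit n x n diagonal W (only feasible for small n)
mkmat a1 a2 a3, matrix(A)
local n = _N
matrix W = J(`n', `n', 0)
forvalues i = 1/`n' {
    matrix W[`i', `i'] = res2[`i']
}
matrix mymat = A' * W * A

* Route 1: importance weights are not rescaled
matrix accum H = a1 a2 a3 [iw=res2], noc

* Route 2: analytic weights are rescaled to sum to N,
* so multiply the result by sum(res2)/N to undo the rescaling
matrix accum G = a1 a2 a3 [aw=res2], noc
quietly summarize res2
matrix G = G * (r(sum) / r(N))

* Relative differences from the explicit calculation
display mreldif(H, mymat)
display mreldif(G, mymat)
```

If both displayed differences are at rounding-error level, the same `[iw=res2]` call can be used on the full dataset, since `matrix accum` only ever stores the k x k result.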