Mata is a full-blown programming language that compiles what you type into byte-code, optimizes it, and executes it fast. Behind the scenes, many of the new features of Stata 9, such as linear mixed models and multinomial probit, were written in Mata. You can use Mata to implement big systems, or you can use it interactively.

To enter Mata, type mata at Stata’s dot prompt. To exit, type end at Mata’s colon prompt:

    . mata
    ------------- mata (type end to exit) -------------
    : sqrt(-4)
      .

    : sqrt(-4+0i)
      2i

    : end

Mata supports real and complex numbers, binary and text strings (up to 2,147,483,647 characters long), and, for serious programming problems, even pointers!

Mata uses LAPACK routines for its advanced matrix features, such as Cholesky decomposition, LU decomposition, QR decomposition, SV decomposition, eigenvalues and eigenvectors, and solvers and inverters.

Mata supports matrices that are views onto, not copies of, the data. Say you have loaded a dataset of 200,000 observations and 150 variables, and you need a matrix of 80 of those variables in 180,000 of the observations. Rather than requiring 110 megabytes, Mata needs only 640 bytes.
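The figures above are consistent with 8 bytes per stored value: 80 x 180,000 x 8 bytes is about 110 megabytes for a copy, against 80 x 8 = 640 bytes for a view. The following is a minimal sketch of the difference between a view and a copy, assuming Stata's shipped auto dataset and its variable names (neither is part of the original text); it uses st_view() to create a view, st_data() to create a copy, and isview() to tell them apart.

```mata
// Run from a do-file. The auto dataset and its variable names are
// assumptions chosen for illustration only.
sysuse auto, clear

mata:
// V becomes a view onto three variables, all observations; nothing is copied
V = .
st_view(V, ., ("price", "mpg", "weight"))

// X is an ordinary matrix holding a copy of the same data
X = st_data(., ("price", "mpg", "weight"))

isview(V)                // 1: V is a view
isview(X)                // 0: X is a copy
rows(V) == rows(X)       // 1: both have one row per observation
end
```

Because V refers to the data rather than copying it, changes made to V's elements would change the Stata dataset itself, which is worth keeping in mind when choosing between the two.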

Everybody knows that matrix languages evaluate matrix expressions, such as b=invsym(X'X)*X'y, and Mata is no exception. Because of Mata’s design, however, it is fast enough to work at the element level. Here is Mata’s polynomial solver:

    numeric rowvector polysolve(numeric vector y, numeric vector x)
    {
            numeric rowvector res, c, empty
            real scalar       i, j, n

            if (cols(y) != cols(x) | rows(y) != rows(x)) _error(3200)
            if ((n=length(x)) == 0) _error(3200)

            res = (iscomplex(y) | iscomplex(x) ? 0i : 0)
            for (j=1; j<=n; j++) {
                    c = (1)
                    for (i=1; i<=n; i++) {
                            if (i != j) {
                                    c = polymult(c, (-x[i],1) :/ (x[j]-x[i]))
                            }
                    }
                    res = polyadd(res, y[j] :* c)
            }
            while (res[cols(res)]==0) res = res[|1,1 \ 1,cols(res)-1|]
            return(res)
    }

Much of Mata is written in Mata.
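To see what the solver returns, it may help to call it on a small case. The sketch below uses three made-up points that lie on the parabola 1 + 2x + 3x^2 and then checks the result with polyeval(); the data values are assumptions chosen for illustration.

```mata
mata:
// Three points on y = 1 + 2x + 3x^2 (illustrative values)
x = (0, 1, 2)
y = (1, 6, 17)

// Coefficients come back lowest order first, the layout polyeval() expects
c = polysolve(y, x)
c

// Evaluating the fitted polynomial at x should reproduce y up to roundoff,
// so this relative difference should be zero or very close to it
mreldif(polyeval(c, x), y)
end
```

The returned row vector should be close to (1, 2, 3), that is, the constant term, the linear term, and the quadratic term in that order.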

Here is how we introduce Mata in the **Mata Reference Manual**.

[M-1] first — Introduction and first session

Mata is a component of Stata. It is a matrix programming language that can be used interactively or as an extension for do-files and ado-files. Thus Mata has something for everybody.

- Mata can be used by users who want to think in matrix terms and perform (not necessarily simple) matrix calculations interactively, and
- Mata can be used by advanced Stata programmers who want to add features to Stata.

Primary features of Mata are that it is fast and that it is C-like.

This introduction is presented under the headings

- Invoking Mata
- Using Mata
- Making mistakes: interpreting error messages
- Working with real numbers, complex numbers, and strings
- Working with scalars, vectors, and matrices
- Working with functions
- Distinguishing real and complex values
- Working with matrix and scalar functions
- Performing element-by-element calculations: colon operators
- Writing programs
- More functions
- Mata environment commands
- Exiting Mata

To enter Mata, type mata at Stata’s dot prompt and press enter; to exit Mata, type end at Mata’s colon prompt:

    . mata                  <- type mata to enter Mata
    ------------- mata (type end to exit) -------------
    : 2+2                   <- type Mata statements at the
      4                        colon prompt

    : end                   <- type end to return to Stata
    ---------------------------------------------------
    . _                     <- you are back to Stata

When you type a statement into Mata, Mata compiles what you typed and, if it compiled without error, executes it:

    : 2+2
      4

    : _

We typed 2+2, a particular example from the general class of expressions. Mata responded with 4, the evaluation of the expression.

Often what you type are expressions, although you will probably choose more complicated examples. When an expression is not assigned to a variable, the result of the expression is displayed. Assignment is performed by the = operator:

    : x = 2 + 2
    : x
      4

    : _

When we type x=2+2, the expression is evaluated and stored in the variable we just named x. The result is not displayed. We can look at the contents of x, however, simply by typing x. From Mata’s perspective, x is not only a variable, it is also an expression, albeit a rather simple one. Just as 2+2 says to load 2, load another 2, and add them, the expression x says to load x and stop there.

As an aside, Mata distinguishes uppercase and lowercase. X is not the same as x:

    : X = 2 + 3
    : x
      4

    : X
      5

If you make a mistake, Mata complains, and then you continue on your way. For instance,

    : 2,,3
    invalid expression
    r(3000);

    : _

2,,3 makes no sense to Mata, so Mata complained. This is an example of what is called a compile-time error; Mata could not make sense out of what we typed.

The other kind of error is called a run-time error. For example, we have no variable called y. Let us ask Mata to show us the contents of y:

    : y
                     <istmt>:  3499  y not found
    r(3499);

    : _

In this case, what we typed made perfect sense (show me y), but y has never been defined. This ugly message is called a run-time error message (see [M-2] errors for a complete description), but all that’s important is to understand the difference between

    invalid expression

and

    <istmt>:  3499  y not found

The run-time message is prefixed by an identity (<istmt> in this case) and a number (3499 in this case). Mata is telling us, "I was executing your istmt [that’s what everything you type is called] and I got error 3499, the details of which are that I was unable to find y."

The compile-time error message is of a simpler form: "invalid expression". When you get such unprefixed error messages, that means Mata could not understand what you typed. When you get the more complicated error message, that means Mata understood what you typed, but there was a problem performing your request.

Another way to tell the difference between compile-time errors and run-time errors is to look at the return code. Compile-time errors have a return code of 3000:

    : 2,,3
    invalid expression
    r(3000);

Run-time errors have a return code that might be in the 3000s, but is never 3000 exactly:

    : y
                     <istmt>:  3499  y not found
    r(3499);

Whether the error is compile-time or run-time, once the error message is issued, Mata is ready to continue just as if the error never happened.

As we have seen, Mata works with real numbers:

    : 2+3
      5

Mata also understands complex numbers; you write the imaginary part by suffixing a lowercase i:

    : 1+2i + 4-1i
      5+1i

For imaginary numbers, you can omit the real part:

    : 1+2i - 2i
      1

Whether a number is real or complex, you can use the same computer notation for the imaginary part as you would for the real part:

    : 2.5e+3i
      2500i

    : 1.25e+2+2.5e+3i        /* i.e., 1.25e+02 + 2.5e+03i */
      125 + 2500i

We purposely wrote the last example in nearly unreadable form just to emphasize that Mata could interpret it.

Mata also understands strings, which you write enclosed in double quotes:

    : "Alpha"+"Beta"
      AlphaBeta

Just like Stata, Mata understands simple and compound double quotes:

    : `"Alpha"'+`"Beta"'
      AlphaBeta

You can add complex and reals,

    : 1+2i + 3
      4+2i

but you may not add reals or complex to strings:

    : 2 + "alpha"
                     <istmt>:  3250  type mismatch
    r(3250);

We got a run-time error. Mata understood 2 + "alpha" all right; it just could not perform our request.

In addition to understanding scalars, be they real, complex, or string, Mata understands vectors and matrices of real, complex, and string elements:

    : x = (1,2)
    : x
           1   2
        +---------+
      1 |  1   2  |
        +---------+

x now contains the row vector (1,2). We can add vectors:

    : x + (3,4)
           1   2
        +---------+
      1 |  4   6  |
        +---------+

The "," is the column-join operator; things like (1,2) are expressions, just as (1+2) is an expression:

    : y = (3,4)
    : z = (x,y)
    : z
           1   2   3   4
        +-----------------+
      1 |  1   2   3   4  |
        +-----------------+

In the above, we could have dispensed with the parentheses and typed "y=3,4" followed by "z=x,y", just as we could using the + operator, although most people find vectors more readable when enclosed in parentheses.

"\" is the row-join operator:

    : a = (1\2)
    : a
           1
        +-----+
      1 |  1  |
      2 |  2  |
        +-----+

    : b = (3\4)
    : c = (a\b)
    : c
           1
        +-----+
      1 |  1  |
      2 |  2  |
      3 |  3  |
      4 |  4  |
        +-----+

Using the column-join and row-join operators, we can enter matrices:

    : A = (1,2 \ 3,4)
    : A
           1   2
        +---------+
      1 |  1   2  |
      2 |  3   4  |
        +---------+

The use of these operators is not limited to scalars. Remember, x is the row vector (1,2), y is the row vector (3,4), a is the column vector (1\2), and b is the column vector (3\4). Therefore,

    : x\y
           1   2
        +---------+
      1 |  1   2  |
      2 |  3   4  |
        +---------+

    : a,b
        +---------+
        |  1   3  |
        |  2   4  |
        +---------+

But if we try something nonsensical, we get an error:

    : a,x
                     <istmt>:  3200  nonconformable matrices

We create complex vectors and matrices just as we create real ones, the only difference being that their elements are complex:

    : Z = (1+1i, 2+3i \ 3-2i, -1-1i)
    : Z
                 1         2
        +---------------------+
      1 |   1 + 1i    2 + 3i  |
      2 |   3 - 2i   -1 - 1i  |
        +---------------------+

Similarly, we can create string vectors and matrices, which are vectors and matrices with string elements:

    : S = ("1st element", "2nd element" \ "another row", "last element")
    : S
                      1              2
        +-------------------------------+
      1 |   1st element    2nd element  |
      2 |   another row   last element  |
        +-------------------------------+

In the case of strings, the individual elements can be up to 2,147,483,647 characters long.

Mata’s expressions also include functions:

    : sqrt(4)
      2

    : sqrt(-4)
      .

When we ask for the square root of -4, Mata replies "." Further, note that . can be stored just like any other number:

    : findout = sqrt(-4)
    : findout
      .

"." means missing, that there is no answer to our calculation. Taking the square root of a negative number is not an error; it merely produces missing. To Mata, missing is a number like any other number, and the rules for all the operators have been generalized to understand missing. For instance, the addition rule is generalized such that missing plus anything is missing:

    : 2 + .
      .

Still, it should surprise you that Mata produced missing for the sqrt(-4). We said that Mata understands complex numbers, so should not the answer be 2i? The answer is that it should be if you are working on the complex plane, but otherwise, missing is probably a better answer. Mata attempts to intuit the kind of answer you want by context, and in particular, uses inheritance rules. If you ask for the square root of a real number, you get a real number back. If you ask for the square root of a complex number, you get a complex number back:

    : sqrt(-4+0i)
      2i

Here complex means multipart: -4+0i is a complex number; it merely happens to have 0 imaginary part. Thus:

    : areal = -4
    : acomplex = -4 + 0i
    : sqrt(areal)
      .

    : sqrt(acomplex)
      2i

If you ever have a real scalar, vector, or matrix, and want to make it complex, use the C() function, which means "convert to complex":

    : sqrt(C(areal))
      2i

C() is documented in [M-5] C(). C() allows one or two arguments. With one argument, it casts to complex. With two arguments, it makes a complex out of the two real arguments. Thus you could type

    : sqrt(-4+2i)
      .485868272 + 2.05817103i

or you could type

    : sqrt(C(-4,2))
      .485868272 + 2.05817103i

By the way, used with one argument, C() also allows complex, and then it does nothing:

    : sqrt(C(acomplex))
      2i

It is virtually impossible to tell the difference between a real value and a complex value with zero imaginary part:

    : areal = -4
    : acomplex = -4 + 0i
    : areal
      -4

    : acomplex
      -4

Yet, as we have seen, the difference is important: sqrt(areal) is missing, sqrt(acomplex) is 2i. One solution is the eltype() function:

    : eltype(areal)
      real

    : eltype(acomplex)
      complex

eltype() can also be used with strings,

    : astring = "hello"
    : eltype(astring)
      string

but this is mostly useful in programming contexts.
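As one illustration of such a programming context, the sketch below defines a hypothetical helper, safesqrt(), that is not part of Mata. It moves to the complex plane only when a negative element makes that necessary, and eltype() is then used to confirm which kind of result came back.

```mata
mata:
// Hypothetical helper for illustration; not a built-in Mata function
numeric matrix safesqrt(real matrix X)
{
        // any negative element: promote to complex before taking roots
        if (any(X :< 0)) return(sqrt(C(X)))
        // otherwise stay on the real line
        return(sqrt(X))
}

eltype(safesqrt((1, 4)))       // real
eltype(safesqrt((-4, 4)))      // complex
safesqrt((-4, 4))              // 2i in the first position, 2 in the second
end
```

Declaring the return type as numeric lets the same function return either a real or a complex matrix, which is the design the built-in functions discussed here also follow.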

Some functions are matrix functions: they require a matrix and return a matrix. Mata’s invsym(X) is an example of such a function. It returns the matrix that is the inverse of symmetric, real matrix X:

    : X = (76, 53, 48 \ 53, 88, 46 \ 48, 46, 63)
    : Xi = invsym(X)
    : Xi
    [symmetric]
                      1              2              3
        +----------------------------------------------+
      1 |   .0298458083                                |
      2 |  -.0098470272    .0216268926                 |
      3 |  -.0155497706   -.0082885675    .0337724301  |
        +----------------------------------------------+

    : Xi*X
                      1              2              3
        +----------------------------------------------+
      1 |             1   -8.67362e-17   -8.50015e-17  |
      2 |  -1.38778e-16              1   -1.02349e-16  |
      3 |             0    1.11022e-16              1  |
        +----------------------------------------------+

The last matrix, Xi*X, differs just a little from the identity matrix due to unavoidable computational roundoff error.

Other functions are, mathematically speaking, scalar functions. sqrt() is an example in that it makes no sense to speak of sqrt(X). (That is, it makes no sense to speak of sqrt(X) unless we were speaking of the Cholesky square-root decomposition. Mata has such a matrix function; see [M-5] cholesky().)

When a function is, mathematically speaking, a scalar function, the corresponding Mata function will usually allow vector and matrix arguments, and in that case, the Mata function makes the calculation on each element individually:

    : M = (1,2 \ 3,4 \ 5,6)
    : M
           1   2
        +---------+
      1 |  1   2  |
      2 |  3   4  |
      3 |  5   6  |
        +---------+

    : S = sqrt(M)
    : S
                     1             2
        +-----------------------------+
      1 |            1   1.414213562  |
      2 |  1.732050808             2  |
      3 |  2.236067977   2.449489743  |
        +-----------------------------+

    : S[1,2]*S[1,2]
      2

    : S[2,1]*S[2,1]
      3

When a function returns a result calculated in this way, it is said to return an element-by-element result.

Mata’s operators, such as + (addition) and * (multiplication), work as you would expect. In particular, * performs matrix multiplication:

    : A = (1, 2 \ 3, 4)
    : B = (5, 6 \ 7, 8)
    : A*B
            1    2
        +-----------+
      1 |  19   22  |
      2 |  43   50  |
        +-----------+

The first element of the result was calculated as 1*5+2*7=19. Sometimes, you really want the element-by-element result. When you do, place a colon in front of the operator: Mata’s :* operator performs element-by-element multiplication:

    : A:*B
            1    2
        +-----------+
      1 |   5   12  |
      2 |  21   32  |
        +-----------+

See [M-2] op_colon for more information.
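The colon prefix applies to other operators as well. The sketch below reuses the matrices A and B from the example and shows element-by-element division, element-by-element powers, and a colon operator combining a matrix with a row vector; the use of mean() to produce that row vector is an illustrative choice, not something taken from the text above.

```mata
mata:
A = (1, 2 \ 3, 4)
B = (5, 6 \ 7, 8)

A :/ B            // element-by-element division
A :^ 2            // each element squared: (1, 4 \ 9, 16)
A :- mean(A)      // column means (2, 3) subtracted from every row

// Element-by-element squaring is not matrix multiplication
(A :^ 2) == (A * A)     // 0: the two results differ
end
```

The last comparison returns 0 because A*A is (7, 10 \ 15, 22), whereas A:^2 squares each element separately.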

Mata is a complete programming language; it will allow you to create your own functions:

    : function add(a,b) return(a+b)

That single statement creates a new function, although perhaps you would prefer if we typed it as

    : function add(a, b)
    > {
    >         return(a+b)
    > }

because that makes it obvious that a program can contain many lines. In either case, once defined, we can use the function:

    : add(1,2)
      3

    : add(1+2i,4-1i)
      5+1i

    : add( (1,2), (3,4) )
           1   2
        +---------+
      1 |  4   6  |
        +---------+

    : add(x,y)
           1   2
        +---------+
      1 |  4   6  |
        +---------+

    : add(A,A)
           1   2
        +---------+
      1 |  2   4  |
      2 |  6   8  |
        +---------+

    : Z1 = (1+1i, 1+1i \ 2, 2i)
    : Z2 = (1+2i, -3+3i \ 6i, -2+2i)
    : add(Z1, Z2)
                 1         2
        +---------------------+
      1 |   2 + 3i   -2 + 4i  |
      2 |   2 + 6i   -2 + 4i  |
        +---------------------+

    : add("Alpha","Beta")
      AlphaBeta

    : S1 = ("one", "two" \ "three", "four")
    : S2 = ("abc", "def" \ "ghi", "jkl")
    : add(S1, S2)
                  1          2
        +-----------------------+
      1 |    oneabc     twodef  |
      2 |  threeghi    fourjkl  |
        +-----------------------+

Of course, our little function add() does not do anything that the + operator does not already do, but we could write a program that did do something different. The following program will allow us to make n x n identity matrices:

    : real matrix id(real scalar n)
    > {
    >         real scalar i
    >         real matrix res
    >
    >         res = J(n, n, 0)
    >         for (i=1; i<=n; i++) {
    >                 res[i,i] = 1
    >         }
    >         return(res)
    > }

    : I3 = id(3)
    : I3
    [symmetric]
           1   2   3
        +-------------+
      1 |  1          |
      2 |  0   1      |
      3 |  0   0   1  |
        +-------------+

The function J() in the program line res = J(n, n, 0) is a Mata built-in function that returns an n x n matrix containing 0s (J(r,c,val) returns an r x c matrix, the elements of which are all equal to val); see [M-5] J().

for (i=1; i<=n; i++) says that starting with i=1 and so long as i<=n do what is inside the braces (set res[i,i] equal to 1) and then (we are back to the for part again), increment i.

The final line, return(res), says to return the matrix we have just created.

Actually, just as with add(), we do not need id() because Mata has a built-in function I(n) that makes identity matrices, but it is interesting to see how the problem could be programmed.
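A function that does something the built-in operators do not do in one step might look like the following sketch. It wraps the least-squares expression invsym(X'X)*X'y in a declared function; the name myols() and the data are hypothetical and chosen only for illustration.

```mata
mata:
// Hypothetical function for illustration: least-squares coefficients
real colvector myols(real matrix X, real colvector y)
{
        if (rows(X) != rows(y)) _error(3200)    // conformability check
        return(invsym(X'X) * X'y)
}

X = (1, 1 \ 1, 2 \ 1, 3)     // a constant and one regressor (made-up data)
y = (2 \ 4 \ 6)              // exactly y = 0 + 2*x
b = myols(X, y)
b
end
```

Because the made-up y lies exactly on the line 0 + 2x, the result should be close to (0 \ 2), differing at most by roundoff. Declaring the argument and return types, as id() does, lets Mata reject calls with the wrong kind of argument.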

Mata has lots of functions already and much of this manual concerns documenting what those functions do; see [M-4] intro. But right now, what is important is that many of the functions are themselves written in Mata!

One of those functions is pi(); it takes no arguments and returns the value of pi. The code for it reads

    real scalar pi() return(3.141592653589793238462643)

There is no reason to type the above function because it is already included as part of Mata:

    : pi()
      3.141592654

When Mata lists a result, it does not show as many digits, but we could ask to see more:

    : printf("%17.0g", pi())
      3.14159265358979

Other Mata functions include the hyperbolic functions sinh(u), cosh(u), etc. The code for sinh(u), cosh(u), and tanh(u) reads

    numeric matrix sinh(numeric matrix u) return((exp(u)-exp(-u)):/2)

    numeric matrix cosh(numeric matrix u) return((exp(u)+exp(-u)):/2)

    numeric matrix tanh(numeric matrix u)
    {
            numeric matrix eu, emu

            eu = exp(u)
            emu = exp(-u)
            return( (eu-emu):/(eu+emu) )
    }

See for yourself: at the Stata dot prompt (not the Mata colon prompt), type

    . viewsource sinh.mata
    . viewsource cosh.mata
    . viewsource tanh.mata

When the code for a function was written in Mata (as opposed to having been written in C), viewsource can show you the code; see [M-1] source.

Returning to the functions themselves, this is the first time we have seen the word numeric: it means real or complex. Built-in (previously written) function exp() works like sqrt() in that it allows a real or complex argument and correspondingly returns a real or complex result. Said in Mata jargon, exp() allows a numeric argument and correspondingly returns a numeric result. sinh(), cosh(), and tanh() will also work like sqrt() and exp().

Another characteristic sinh(), cosh(), and tanh() share with sqrt() and exp() is element-by-element operation. sinh(), cosh(), and tanh() are element-by-element because exp() is element-by-element and because we were careful to use the :/ (element-by-element) divide operator.

In any case, there is no need to type the above functions because they are already part of Mata. You could learn more about them by seeing their manual entry, [M-5] sin().

At the other extreme, Mata functions can become quite long. Here is Mata’s function to solve AX=B for X when A is lower triangular, placing the result X back into B:

    real scalar _solvelower(
                    numeric matrix A, numeric matrix b,
                    |real scalar usertol, numeric scalar userd)
    {
            real scalar             tol, rank, a_t, b_t, d_t
            real scalar             n, m, i, im1, complex_case
            numeric rowvector       sum
            numeric scalar          zero, d

            d = userd

            if ((n=rows(A))!=cols(A)) _error(3205)
            if (n != rows(b))         _error(3200)
            if (isview(b))            _error(3104)
            m = cols(b)
            rank = n

            a_t = iscomplex(A)
            b_t = iscomplex(b)
            d_t = d<. ? iscomplex(d) : 0

            complex_case = a_t | b_t | d_t

            if (complex_case) {
                    if (!a_t) A = C(A)
                    if (!b_t) b = C(b)
                    if (d<. & !d_t) d = C(d)
                    zero = 0i
            }
            else zero = 0

            if (n==0 | m==0) return(0)

            tol = solve_tol(A, usertol)

            if (abs(d) >=. ) {
                    if (abs(d=A[1,1])<=tol) {
                            b[1,.] = J(1, m, zero)
                            --rank
                    }
                    else {
                            b[1,.] = b[1,.] :/ d
                            if (missing(d)) rank = .
                    }
                    for (i=2; i<=n; i++) {
                            im1 = i - 1
                            sum = A[|i,1\i,im1|] * b[|1,1\im1,m|]
                            if (abs(d=A[i,i])<=tol) {
                                    b[i,.] = J(1, m, zero)
                                    --rank
                            }
                            else {
                                    b[i,.] = (b[i,.]-sum) :/ d
                                    if (missing(d)) rank = .
                            }
                    }
            }
            else {
                    if (abs(d)<=tol) {
                            rank = 0
                            b = J(rows(b), cols(b), zero)
                    }
                    else {
                            b[1,.] = b[1,.] :/ d
                            for (i=2; i<=n; i++) {
                                    im1 = i - 1
                                    sum = A[|i,1\i,im1|] * b[|1,1\im1,m|]
                                    b[i,.] = (b[i,.]-sum) :/ d
                            }
                    }
            }
            return(rank)
    }
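Calling the function on a small system may make its conventions clearer. In the sketch below, the 2 x 2 matrix and right-hand side are made-up values; the point is that, going by the code above, the second argument is overwritten with the solution and the returned value is the rank.

```mata
mata:
A = (2, 0 \ 1, 3)        // lower triangular (illustrative values)
B = (4 \ 5)
Bcopy = B                // keep the right-hand side; B will be overwritten

rank = _solvelower(A, B)
rank                     // 2: both diagonal elements are usable
B                        // now holds the solution X

// A*X should reproduce the original right-hand side up to roundoff
mreldif(A * B, Bcopy)
end
```

Working by hand, 2*x1 = 4 gives x1 = 2, and 1*2 + 3*x2 = 5 gives x2 = 1, so B should end up close to (2 \ 1).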

If the function were not already part of Mata and you wanted to use it, you could type it into a do-file or onto the end of an ado-file (especially good if you just want to use _solvelower() as a subroutine). In those cases, do not forget to enter and exit Mata:

    ---------------------------------------- top of file ---
    program mycommand
            ...
            ado-file code appears here
            ...
    end

    mata:
            _solvelower() code appears here
    end
    ------------------------------------- bottom of file ---

Sharp-eyed readers will notice that we put a colon on the end of the Mata command. That’s a detail and why we did that is explained in [M-3] mata.

In addition to loading functions by putting their code in do- and ado-files, you can also save the compiled versions of functions in .mo files (see [M-3] mata mosave) or into .mlib Mata libraries (see [M-3] mata mlib).

In the case of _solvelower(), it has already been saved into a library, namely Mata’s official library, so you need not do any of this.
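For a function of your own, saving the compiled version might look like the following sketch. It reuses the add() function from earlier; the library name lmyfuncs is hypothetical, and both commands write files to the current directory unless told otherwise.

```mata
mata:
function add(a, b) return(a + b)

// Save the compiled function as add.mo in the current directory
mata mosave add(), replace

// Alternatively, collect functions in a library.
// lmyfuncs is a made-up name; library names begin with the letter l.
mata mlib create lmyfuncs, replace
mata mlib add lmyfuncs add()
mata mlib index
end
```

A .mo file holds a single function, whereas a library can hold many, so the library route tends to be the more convenient one once there are several related functions to distribute.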

When you are using Mata, there is a set of commands that will tell you about and manipulate Mata’s environment.

The most useful such command is mata describe:

    : mata describe

          # bytes   type                        name and extent
               34   transmorphic matrix         add()
               98   real matrix                 id()
               32   real matrix                 A[2,2]
               32   real matrix                 B[2,2]
               72   real matrix                 I3[3,3]
               48   real matrix                 M[3,2]
               48   real matrix                 S[3,2]
               47   string matrix               S1[2,2]
               44   string matrix               S2[2,2]
               72   real matrix                 X[3,3]
               72   real matrix                 Xi[3,3]
               64   complex matrix              Z[2,2]
               64   complex matrix              Z1[2,2]
               64   complex matrix              Z2[2,2]
               16   real colvector              a[2]
               16   complex scalar              acomplex
                8   real scalar                 areal
               16   real colvector              b[2]
               32   real colvector              c[4]
                8   real scalar                 findout
               16   real rowvector              x[2]
               16   real rowvector              y[2]
               32   real rowvector              z[4]

    : _

Another useful command is mata clear, which will clear Mata without disturbing Stata:

    : mata clear
    : mata describe

          # bytes   type                        name and extent

There are other useful mata commands; see [M-3] intro. Do not confuse this command mata, which you type at Mata’s colon prompt, with Stata’s command mata, which you type at Stata’s dot prompt and which invokes Mata.

When you are done using Mata, type end to Mata’s colon prompt:

    : end
    . _

Exiting Mata does not clear it:

    . mata
    ------------- mata (type end to exit) -------------
    : x = 2
    : y = (3+2i)
    : function add(a,b) return(a+b)
    : end

    . ...

    . mata
    ------------- mata (type end to exit) -------------
    : mata describe

          # bytes   type                        name and extent
               34   transmorphic matrix         add()
                8   real scalar                 x
               16   complex scalar              y

    : end

Exiting Stata clears Mata, as does Stata’s clear command; see [R] drop.

Also see

Manual: [M-1] first

Online: [M-0] intro, [M-1] intro