# RE: st: manipulating matrix elements

From: "Nick Cox"
Subject: RE: st: manipulating matrix elements
Date: Wed, 29 Mar 2006 15:36:28 +0100

Bill kindly provided the definitive answer.

A totally separate issue from how best to
code this in Stata is why Steve Vaisey
wants to add ln(#rows * #columns) in his
calculation of entropy.

I am so familiar with the definition SUM p ln p
that I unconsciously filtered out the additive term in my
various solutions. Where does the extra term
come from? Could Steve or anyone else provide
a reference for this definition please?
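
One possible reading of the expression, offered as a sketch and not as the source of Steve's definition: write H for the usual entropy of a table with M cells, taking 0 ln 0 as 0. The formula is then ln M minus H, which is 0 when all cells are equally likely and ln M when every case falls in a single cell.

```latex
H = -\sum_{i,j} p_{ij} \ln p_{ij}, \qquad 0 \le H \le \ln M,
\qquad\text{so}\qquad
\sum_{i,j} p_{ij} \ln p_{ij} + \ln M \;=\; \ln M - H \;\in\; [0, \ln M].
```

This only shows what the additive term does to the range; it does not settle where the definition comes from.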

Nick
n.j.cox@durham.ac.uk

William Gould, Stata, wrote:

>
> > I am trying to implement the following formula for estimating the
> > (standardized) informational entropy of a RxC contingency table:
> > sum(p*ln(p))+ln(M) where p (subscript ij omitted) is the proportion of
> > cases in each cell and M is the total number of cells.  So far, I've
> > only managed to accomplish the following:
> >
> >       . tab var1 var2, matcell(Cell)
> >       . matrix Proportions = Cell/r(N)
>
> There have already been lots of answers to this question.  I particularly
> like the answer by Nick Cox <n.j.cox@durham.ac.uk> because he used Mata,
> but Nick packaged his solution into an ado-file with a Mata subroutine,
> which disguised how easy the solution is.
>
> The entire solution to the problem is
>
>         . tab var1 var2, matcell(Cell)
>         . mata:
>         : M = st_numscalar("r(N)")
>         : X = st_matrix("Cell")
>         : P = X/M
>         : sum(P:*ln(P))+ln(M)
>         : end
>
> You can just type the above interactively into Stata.  It will work.
>
> In Steve's original question, he wrote
>
> > For some (probably good) reason, while you can easily multiply matrix
> > elements by a scalar, you can't do something like:
> >
> >       . matrix LnP = ln(Proportions)
>
> That's true, you can't do that in Stata's *OLD* matrix language, but you
> can do that in Stata's *NEW* matrix language, Mata.
>
> In the code above, I just did that in-line,
>
>         : sum(P:*ln(P))+ln(M)
>
> but I just as well could have typed:
>
>         : LnP = ln(P)
>         : sum(P:*LnP)+ln(M)
>
> In fact, that is exactly what I did when I tried this problem, and I typed
> even more so that I could see intermediate results and make sure things were
> going well.  Here's the actual log:
>
> ==============================================================================
> . tab var1 var2, matcell(Cell)
>
>     Repair |
>     Record |               var2
>       1978 |         1          2          3 |     Total
> -----------+---------------------------------+----------
>          1 |         5          4          1 |        10
>          2 |         8         15          7 |        30
>          3 |         2         10          6 |        18
>          4 |         3          6          2 |        11
> -----------+---------------------------------+----------
>      Total |        18         35         16 |        69
>
>
> . mata:
> ------------------------------------------------- mata (type end to exit) ----------
> : M = st_numscalar("r(N)")
>
> : M
>   69
>
> : X = st_matrix("Cell")
>
> : X
>         1    2    3
>     +----------------+
>   1 |   5    4    1  |
>   2 |   8   15    7  |
>   3 |   2   10    6  |
>   4 |   3    6    2  |
>     +----------------+
>
> : P = X/M
>
> : P
>                  1             2             3
>     +-------------------------------------------+
>   1 |  .0724637681   .0579710145   .0144927536  |
>   2 |   .115942029   .2173913043   .1014492754  |
>   3 |  .0289855072   .1449275362   .0869565217  |
>   4 |  .0434782609   .0869565217   .0289855072  |
>     +-------------------------------------------+
>
> : lnP = ln(P)
>
> : lnP
>                   1              2              3
>     +----------------------------------------------+
>   1 |  -2.624668592   -2.847812143   -4.234106505  |
>   2 |  -2.154664963   -1.526056303   -2.288196356  |
>   3 |  -3.540959324   -1.931521412   -2.442347035  |
>   4 |  -3.135494216   -2.442347035   -3.540959324  |
>     +----------------------------------------------+
>
> : sum(P:*lnP)+ln(M)
>   1.957469762
>
> : end
> ==============================================================================
>
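
As a cross-check outside Stata, the arithmetic in the log can be reproduced in a few lines of Python. This is a minimal sketch: the variable names are illustrative and do not come from the thread. It also makes one detail visible: in the session M is r(N), the number of cases (69), whereas the formula as Steve stated it takes M to be the number of cells (here 4 x 3 = 12), so the two readings add different constants.

```python
import math

# Cell counts copied from the tabulation in the log (4 rows x 3 columns).
cells = [
    [5, 4, 1],
    [8, 15, 7],
    [2, 10, 6],
    [3, 6, 2],
]

n_cases = sum(sum(row) for row in cells)   # 69, the value r(N) holds
n_cells = len(cells) * len(cells[0])       # 12, rows x columns

# Sum of p*ln(p) over all cells; empty cells are skipped, taking 0*ln(0) as 0.
sum_p_ln_p = sum(
    (x / n_cases) * math.log(x / n_cases)
    for row in cells
    for x in row
    if x > 0
)

as_in_log = sum_p_ln_p + math.log(n_cases)        # adds ln(69), as the log does
with_cell_count = sum_p_ln_p + math.log(n_cells)  # adds ln(12), the number of cells

print(round(as_in_log, 4))        # 1.9575
print(round(with_cell_count, 4))  # 0.2083
```

If the cell count is what is wanted, the Mata session would presumably need rows(X)*cols(X) in place of r(N) inside the final ln(); that substitution is an assumption here and is not something the log itself shows.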
>
> By the way, the strange ":*" in -sum(P:*lnP)+ln(M)- is Mata's elementwise
> multiplication operator.  P*lnP would be matrix multiplication.
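
The difference between the two products can be sketched in plain Python. The helper names `elementwise` and `matmul` are hypothetical, written only for this illustration, and are not Mata functions.

```python
def elementwise(a, b):
    """Multiply matching entries; a and b are assumed to have the same shape."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def matmul(a, b):
    """Ordinary matrix product; the columns of a must equal the rows of b."""
    if len(a[0]) != len(b):
        raise ValueError("matrices are not conformable")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

print(elementwise(A, B))  # [[5, 12], [21, 32]]
print(matmul(A, B))       # [[19, 22], [43, 50]]
```

For the 4 x 3 matrices P and lnP in the log, only the elementwise product is defined: an ordinary matrix product of two 4 x 3 matrices is not conformable, so `matmul` as sketched here would raise an error.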

