Statalist The Stata Listserver


[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

RE: st: manipulating matrix elements


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: manipulating matrix elements
Date   Wed, 29 Mar 2006 15:36:28 +0100

Bill kindly provided the definitive answer. 

A totally separate issue from how best to 
code this in Stata is why Steve Vaisey 
wants in his calculation of entropy to add
ln(#rows * #columns). 

I am so familiar with the definition SUM p ln p
that I unconsciously filtered out the additive term in my 
various solutions. Where does the extra term
come from? Could Steve or anyone else provide 
a reference for this definition please? 

Nick 
n.j.cox@durham.ac.uk 

William Gould, Stata
 
> <vaisey@unc.edu> asked, 
> 
> > I am trying to implement the following formula for estimating the 
> > (standardized) informational entropy of a RxC contingency table: 
> > sum(p*ln(p))+ln(M) where p (subscript ij omitted) is the 
> proportion of 
> > cases in each cell and M is the total number of cells.  So 
> far, I've 
> > only managed to accomplish the following:
> > 
> >       . tab var1 var2, matcell(Cell)
> >       . matrix Proportions = Cell/r(N)
> 
> There have already been lots of answers to this question.  I 
> particularly 
> like the answer by Nick Cox <n.j.cox@durham.ac.uk> because he 
> used Mata, 
> but Nick packaged his solution into an ado-file with a Mata 
> subroutine,
> which disguised how easy the solution is.
> 
> The entire solution to the problem is
> 
>         . tab va1 var2, matcell(Cell)
>         . mata:
>         : M = st_numscalar("r(N)")
>         : X = st_matrix("Cell")
>         : P = X/M
>         : sum(P:*ln(P))+ln(M)
>         : end
> 
> You can just type the above interactively into Stata.  It will work.
> 
> In Steve's original question, he wrote 
> 
> > For some (probably good) reason, while you can easily 
> multiply matrix
> > elements by a scalar, you can't do something like:
> >
> >       . matrix LnP = ln(Proportions)
> 
> That's true, you can't do that in Stata's *OLD* matrix 
> language, but you
> can do that in Stata's *NEW* matrix language, Mata.
> 
> In the code above, I just did that in-line,
> 
>         : sum(P:*ln(P))+ln(M)
> 
> but I just as well could have typed:
> 
>         : LnP = ln(P)
>         : sum(P:*LnP)+ln(M)
> 
> In fact, that is exactly what I did when I tried this 
> problem, and I typed 
> even more so that I could see intermediate results and make 
> sure things were
> going well.  Here's the actual log:
> 
> ==============================================================
> ================
> . tab var1 var2, matcell(Cell)
> 
>     Repair |
>     Record |               var2
>       1978 |         1          2          3 |     Total
> -----------+---------------------------------+----------
>          1 |         5          4          1 |        10 
>          2 |         8         15          7 |        30 
>          3 |         2         10          6 |        18 
>          4 |         3          6          2 |        11 
> -----------+---------------------------------+----------
>      Total |        18         35         16 |        69 
> 
> 
> . mata:
> ------------------------------------------------- mata (type 
> end to exit) ----------
> : M = st_numscalar("r(N)")
> 
> : M
>   69
> 
> : X = st_matrix("Cell")
> 
> : X
>         1    2    3
>     +----------------+
>   1 |   5    4    1  |
>   2 |   8   15    7  |
>   3 |   2   10    6  |
>   4 |   3    6    2  |
>     +----------------+
> 
> : P = X/M
> 
> : P
>                  1             2             3
>     +-------------------------------------------+
>   1 |  .0724637681   .0579710145   .0144927536  |
>   2 |   .115942029   .2173913043   .1014492754  |
>   3 |  .0289855072   .1449275362   .0869565217  |
>   4 |  .0434782609   .0869565217   .0289855072  |
>     +-------------------------------------------+
> 
> : lnP = ln(P)
> 
> : lnP
>                   1              2              3
>     +----------------------------------------------+
>   1 |  -2.624668592   -2.847812143   -4.234106505  |
>   2 |  -2.154664963   -1.526056303   -2.288196356  |
>   3 |  -3.540959324   -1.931521412   -2.442347035  |
>   4 |  -3.135494216   -2.442347035   -3.540959324  |
>     +----------------------------------------------+
> 
> : sum(P:*lnP)+ln(M)
>   1.957469762
> 
> : end
> ==============================================================
> ================
> 
> 
> By the way, the strange ":*" in -sum(P:*lnP)+ln(M)- is Mata's 
> elementwise 
> multiplication operator.  P*lnP would be matrix multiplication.

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2020 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index