A twist to this problem is that it is customary
in entropy calculations to define 0 ln 0 as 0.
However, Stata doesn't know that and will
return missing instead. So, it is necessary
to trap 0 within some expression as
scalar x = x + cond(p = 0, 0, p * ln(p))
Nick
n.j.cox@durham.ac.uk
Michael Blasnik
> -findit matmap- will lead you to the matmap package which can perform
> arbitrary elementwise operations on matrices, but since you
> want the sum of
> the calculation across all elements, I think you should just
> do the looping
> yourself. Here's one approach (untested):
>
> scalar x=0
> tab var1 var2, matcell(Cell)
> matrix P = Cell/r(N)
> local rows=rowsof(P)
> local cols=colsof(P)
> forvalues i=1(1)`rows' {
> forvalues j=1(1)`cols' {
> scalar p=P[`i',`j']
> scalar x=x+p*ln(p)
> }
> }
> scalar x=x+ln(`rows'*`cols')
> di x
>
> I've renamed your Proportions matrix as P for brevity and the
> result you
> want should be in scalar x when you are done. The code could
> be made more
> compact (by not calculating rows, cols, or p separately), but
> this approach
> is a little easier to understand. It would be simple to turn
> this into a
> Stata program (ado) file that accepts a variable list, -if- and -in-
> qualifiers, and perhaps other command options. I'm not
> familiar with this
> statistic, but perhaps someone has already done this.
Steve Vaisey
> > I am trying to implement the following formula for estimating the
> > (standardized) informational entropy of a RxC contingency table:
> > sum(p*ln(p))+ln(M) where p (subscript ij omitted) is the
> proportion of
> > cases in each cell and M is the total number of cells. So
> far, I've only
> > managed to accomplish the following:
> >
> > tab var1 var2, matcell(Cell)
> > matrix Proportions = Cell/r(N)
> >
> > Frankly, I'm amazed I've managed even this much, but now
> I'm stuck. What
> > I need to do (as I understand it) is take the natural log
> of each element
> > of the Proportions matrix, multiply it times its
> corresponding proportion,
> > sum up all the elements, and add ln(M).
> >
> > For some (probably good) reason, while you can easily
> multiply matrix
> > elements by a scalar, you can't do something like:
> >
> > matrix LnP = ln(Proportions)
> >
> > I've tried this, but it gives a type mismatch error.
> Again, I'm sure this
> > is for a good reason, but I'm not sure what to do.
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/